154 results
  • Abstract: At present, it is projected that about 4 zettabytes (1 zettabyte = 10^21 bytes) of digital data are being generated per year by everything from underground physics experiments to retail transactions to security cameras to global positioning systems. In the U.S., major research programs are being funded to deal with big data in all five sectors of the economy (i.e., services, manufacturing, construction, agriculture and mining). Big Data is a term applied to data sets whose size is beyond the ability of available tools to undertake their acquisition, access, analytics and/or application in a reasonable amount of time. Whereas Tien (2003) forewarned about the data rich, information poor (DRIP) problems that have been pervasive since the advent of large-scale data collections or warehouses, the DRIP conundrum has been somewhat mitigated by the Big Data approach, which has unleashed information in a manner that can support informed (yet not necessarily defensible or valid) decisions or choices. Thus, by somewhat overcoming data quality issues with data quantity, data access restrictions with on-demand cloud computing, causative analysis with correlative data analytics, and model-driven with evidence-driven applications, appropriate actions can be undertaken with the obtained information. New acquisition, access, analytics and application technologies are being developed to further Big Data as it is being employed to help resolve the 14 grand challenges (identified by the National Academy of Engineering in 2008), underpin the 10 breakthrough technologies (compiled by the Massachusetts Institute of Technology in 2013) and support the Third Industrial Revolution of mass customization.

  • Tags: information, global positioning system, applications, compilation technology, data sets, model-driven
  • Abstract: The design of an OLAP system for supporting real-time queries is one of the major research issues. One approach is to use data cubes, which are pre-computed multidimensional views of data in the data warehouse. An initial set of data cubes can be derived, from which the answer to each frequently asked query can be retrieved directly. However, there are two practical problems concerning the design of a cube-based system: 1) the maintenance cost of the data cubes, and 2) the query cost to answer a selected set of frequently asked queries. Maintaining a data cube requires disk storage and CPU computation, so the maintenance cost is related to the total size of the data cubes materialized, and thus keeping all data cubes is impractical. The total size of cubes may be reduced by merging some cubes. However, the resulting larger cubes will increase the query cost of answering some queries. If the bounds on maintenance cost and query cost are strict, some of the queries need to be sacrificed. An optimization problem in data cube system design has been defined. With a maintenance-cost bound and a query-cost bound given by the user, it is necessary to optimize the initial set of data cubes such that the system can answer a maximum number of queries and satisfy the bounds. This is an NP-complete problem. Approximate algorithms Greedy Removing (GR) and 2-Greedy Merging with Multiple paths (2GGM) are proposed. Experiments have been done on a census database, and the results show that our approach is both effective and efficient.

  • Tags: OLAP, DSS, data cubes, system design, approximation algorithms, decision support systems
  • Abstract: Most of the earlier work on clustering mainly focused on numeric data whose inherent geometric properties can be exploited to naturally define distance functions between data points. However, data mining applications frequently involve many data sets that also consist of mixed numeric and categorical attributes. In this paper we present a clustering algorithm which is based on the k-means algorithm. The algorithm clusters objects with numeric and categorical attributes in a way similar to k-means. The object similarity measure is derived from both numeric and categorical attributes. When applied to numeric data, the algorithm is identical to k-means. The main result of this paper is to provide a method to update the 'cluster centers' of clustering objects described by mixed numeric and categorical attributes in the clustering process to minimise the clustering cost function. The clustering performance of the algorithm is demonstrated with two well-known data sets, namely the credit approval and abalone databases.

  • Tags: data mining, numeric data, categorical data, clustering algorithms, databases, data sets
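
The abstract above does not give its formulas, so the following is only a minimal sketch of one common way to combine numeric and categorical attributes in a k-means-style algorithm (a k-prototypes-style measure). It is an assumption for illustration, not the paper's own method. The function names and the weight `gamma` are hypothetical.

```python
from collections import Counter


def mixed_distance(x_num, x_cat, c_num, c_cat, gamma=1.0):
    """Dissimilarity between an object and a cluster center.

    Assumed form: squared Euclidean distance on the numeric attributes
    plus gamma times the number of mismatching categorical attributes.
    With no categorical attributes this reduces to the k-means distance.
    """
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, c_num))
    mismatches = sum(1 for a, b in zip(x_cat, c_cat) if a != b)
    return numeric + gamma * mismatches


def update_center(members_num, members_cat):
    """Recompute a cluster center from its members.

    Numeric attributes use the mean; categorical attributes use the
    most frequent value (the mode) among the members.
    """
    n = len(members_num)
    center_num = [sum(col) / n for col in zip(*members_num)]
    center_cat = [Counter(col).most_common(1)[0][0] for col in zip(*members_cat)]
    return center_num, center_cat


# Usage: one object against one center, then a center update.
d = mixed_distance([1.0, 2.0], ["a", "b"], [1.0, 4.0], ["a", "c"], gamma=0.5)
center = update_center(
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [["a", "x"], ["a", "y"], ["b", "y"]],
)
```

The weight `gamma` controls how much a categorical mismatch counts relative to numeric distance. Choosing it is data dependent and is left open here.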
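  • Abstract: Only the visible part of the real world can be stored in data. For such incomplete, ill-structured data, data crystallization aims at presenting the hidden structure among events, including unobservable events. This is realized by inserting dummy items, which correspond to the potential existence of unobservable events, into the given data. These dummy items and their relations with observable events are visualized by applying KeyGraph to the data with dummy items, much like the crystallization of snow, in which dust is involved in the formation of crystals from water molecules. To tune the granularity level of the structure to be visualized, the data crystallization tool is integrated with the human process of understanding significant situations in the real world. This basic method is expected to be applicable to the various real-world domains in which previous chance discovery methods have led people to successful decisions. In this paper, we apply data crystallization with human-interactive annealing (DCHA) to product design in a real company. The results show its effect on industrial decision making.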

  • Tags: new products, design, human-computer interaction, data crystallization
  • Abstract: Autocorrelation is prevalent in continuous production processes, such as the processes in the chemical and pharmaceutical industries. With the development of measurement technology and data acquisition technology, sampling frequency is getting higher and the existence of autocorrelation cannot be ignored. This paper analyzes five estimation schemes of process capability for autocorrelated data. Comparisons among these schemes are discussed for small samples and large samples. In conclusion, this paper gives a procedure of process capability analysis for autocorrelated data.

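  • Tags: process capability analysis, autocorrelation, estimation schemes, continuous production processes, data acquisition technology, pharmaceutical processes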
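
The five estimation schemes are not described in the abstract, so the sketch below does not reproduce any of them. Instead it shows, under stated assumptions, the textbook capability indices Cp = (USL - LSL) / (6σ) and Cpk = min(USL - μ, μ - LSL) / (3σ) computed naively from the sample standard deviation, next to a lag-1 autocorrelation estimate. The function names and the sample data are hypothetical.

```python
from statistics import mean, stdev


def lag1_autocorrelation(x):
    """Sample lag-1 autocorrelation of a sequence of measurements."""
    m = mean(x)
    num = sum((x[i] - m) * (x[i + 1] - m) for i in range(len(x) - 1))
    den = sum((v - m) ** 2 for v in x)
    return num / den


def capability_indices(x, lsl, usl):
    """Naive Cp and Cpk from the sample mean and standard deviation.

    This treats the observations as independent, which is exactly the
    assumption that autocorrelated data violate.
    """
    mu = mean(x)
    sigma = stdev(x)
    cp = (usl - lsl) / (6 * sigma)
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)
    return cp, cpk


# Usage with made-up measurements and specification limits.
data = [1.0, 2.0, 3.0, 4.0, 5.0]
r1 = lag1_autocorrelation(data)
cp, cpk = capability_indices(data, lsl=-3.0, usl=9.0)
```

A lag-1 autocorrelation far from zero is a hint that the naive indices may be unreliable and that a scheme designed for autocorrelated data is worth considering.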
  • Abstract: Solving complex decision problems requires the usage of information from different sources. Usually this information is uncertain, and statistical or probabilistic methods are needed for its processing. However, in many cases a decision maker faces not only uncertainty of a random nature but also imprecision in the description of input data that is rather of a linguistic nature. Therefore, there is a need to merge uncertainties of both types into one mathematical model. In the paper we present a methodology of merging information from imprecisely reported statistical data and imprecisely formulated fuzzy prior information. Moreover, we also consider the case of imprecisely defined loss functions. The proposed methodology may be considered as the application of fuzzy statistical methods for decision making in systems analysis.

  • Tags: BAYES decisions imprecise information FUZZY STATISTICAL
  • Abstract: Analyzing schedule risk is an important task in project management. Because a project is a semi-constructed or non-constructed complex system, there are many difficulties in the quantitative schedule risk analysis (SRA). The paper integrates intelligent techniques to obtain the massive basic data required in the risk analysis process. This greatly improves the precision and efficiency of the SRA. In addition, the paper presents a mechanism and architecture of the integrated intelligent systems. Finally, concluding remarks are provided on basic data acquisition in the SRA.

  • Tags: risk analysis, schedule risk, data acquisition, project management, SRA, intelligent systems
  • Abstract: In this paper, for zero-failure data (t_i, n_i) at moment t_i, if the prior distribution of the failure probability p_i = P{T < t_i} is the incomplete Fisher-Z distribution Fisher-Z(0, λ_i; a, b), the author gives the hierarchical Bayesian estimation of p_i, and the estimation of reliability under the zero-failure data condition is also obtained. The author also gives a practical calculation example using the theory.

  • Tags: RELIABILITY zero-failure data FAILURE PROBABILITY HIERARCHICAL
  • Abstract: Large-sized power transformers are important parts of the power supply chain. These very critical networks of engineering assets are an essential base of a nation's energy resource infrastructure. This research identifies the key factors influencing transformer normal operating conditions and predicts the asset management lifespan. Engineering asset research has developed few lifespan forecasting methods combining real-time monitoring solutions for transformer maintenance and replacement. Utilizing the rich data source from a remote terminal unit (RTU) system for sensor-data driven analysis, this research develops an innovative real-time lifespan forecasting approach applying logistic regression based on the Weibull distribution. The methodology and the implementation prototype are verified using a data series from 161 kV transformers to evaluate the efficiency and accuracy for energy sector applications. The asset stakeholders and suppliers significantly benefit from the real-time power transformer lifespan evaluation for maintenance and replacement decision support.

  • Tags: large power transformers, lifespan evaluation, data-driven, logistic regression, Weibull distribution, asset management
  • Abstract: In this paper, two kinds of Kullback-Leibler criteria with appropriate constraints are proposed to construct empirical likelihood confidence intervals for the mean of right censored data. It is shown that one of the criteria is equivalent to Adimari's (1997) procedure, and the other shares the same asymptotic behavior.

  • Tags: random variables, empirical likelihood, Kullback-Leibler criterion
  • Abstract: A data warehouse often accommodates enormous summary information in various granularities and is mainly used to support on-line analytical processing. Ideally all detailed data should be accessible by residing in some legacy systems or on-line transaction processing systems. In many cases, however, data sources in computers are also kinds of summary data due to technological problems or budget limits, and also because different aggregation hierarchies may need to be used among various transaction systems. In such circumstances, it is necessary to investigate how to design dimensions, which play a major role in the dimensional model for a data warehouse, and how to estimate summary information that is not stored in the data warehouse. In this paper, rough set theory is applied to support the dimension design and information estimation.

  • Tags: ROUGH SETS data WAREHOUSE dimension
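  • Abstract: The problem of nonparametric identification of a multivariate nonlinearity in a D-input Hammerstein system is examined. It is shown that if the input measurements are structured, in the sense that there exists some hidden relation between them, i.e., if they are distributed on some (unknown) d-dimensional space M in R^D, d < D, then the system nonlinearity can be recovered at points on M with the convergence rate O(n^{-1/(2+d)}), which depends on d. This rate is thus faster than the generic rate O(n^{-1/(2+D)}) achieved by typical nonparametric algorithms and controlled solely by the number of inputs D.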

  • Tags: Hammerstein, nonparametric identification, structured data, MISO system, multivariate nonlinearity
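  • Abstract: This paper considers additive hazards regression analysis that improves the efficiency of statistical inference by using continuous auxiliary covariate information when the primary covariate is ascertained only for a randomly selected subsample. The authors construct a martingale-based estimating equation for the regression parameters and establish the asymptotic consistency and normality of the resulting estimator. Simulation studies show that the proposed method can greatly improve efficiency in many settings compared with the estimator that discards the auxiliary covariate information. A real example is also provided as an illustration.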

  • Tags: regression analysis, hazards, additive, random selection, statistical inference, regression parameters
  • Abstract: With Landsat constantly delivering a great deal of remote sensing data to the ground, the Remote Sensing Satellite Ground Station has accumulated abundant satellite remote sensing data. For lack of effective data mining (DM) and knowledge discovery from databases (KDD) techniques for these data, most of the information cannot be used efficiently. Technical innovation and improvement of traditional DM and KDD, together with the study of data mining and KDD for remote sensing, will both raise the level and intelligence of interpretation and allow remote sensing information to be explored and utilized to the maximum degree. Based on traditional data mining and KDD, the authors probed the technical flow of DM and KDD for remote sensing, designed the systematic framework of multi-source remote sensing DM, put forward a prototype of a multi-source remote sensing DM system, and established a base for further exploring and developing multi-source remote sensing DM systems.

  • Tags: data mining, remote sensing data, knowledge discovery, KDD
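  • Abstract: In this paper, we propose a novel method that uses fuzzy entropy to identify relevant subspaces and perform clustering. By using membership functions to measure the degree of class match, this measure better discriminates the real distribution. Fuzzy entropy therefore reflects more information about the actual distribution of patterns in the subspaces. We use a heuristic procedure based on the silhouette criterion to find the number of clusters. The presented theory and algorithms are evaluated through experiments on many benchmark data sets. Experimental results, compared with several other clustering algorithms, show its favorable performance.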

  • Tags: clustering algorithms, fuzzy entropy, high-dimensional data, distribution patterns, membership functions, subspaces