Abstract: At present, it is projected that about 4 zettabytes (a zettabyte being 10^21 bytes) of digital data are being generated per year by everything from underground physics experiments to retail transactions to security cameras to global positioning systems. In the U.S., major research programs are being funded to deal with big data in all five sectors (i.e., services, manufacturing, construction, agriculture and mining) of the economy. Big Data is a term applied to data sets whose size is beyond the ability of available tools to undertake their acquisition, access, analytics and/or application in a reasonable amount of time. Whereas Tien (2003) forewarned about the data rich, information poor (DRIP) problems that have been pervasive since the advent of large-scale data collections or warehouses, the DRIP conundrum has been somewhat mitigated by the Big Data approach, which has unleashed information in a manner that can support informed, yet not necessarily defensible or valid, decisions or choices. Thus, by somewhat overcoming data quality issues with data quantity, data access restrictions with on-demand cloud computing, causative analysis with correlative data analytics, and model-driven with evidence-driven applications, appropriate actions can be undertaken with the obtained information. New acquisition, access, analytics and application technologies are being developed to further Big Data as it is being employed to help resolve the 14 grand challenges (identified by the National Academy of Engineering in 2008), underpin the 10 breakthrough technologies (compiled by the Massachusetts Institute of Technology in 2013) and support the Third Industrial Revolution of mass customization.
Abstract: The design of an OLAP system for supporting real-time queries is one of the major research issues. One approach is to use data cubes, which are pre-computed multidimensional views of data in the data warehouse. An initial set of data cubes can be derived, from which the answer to each frequently asked query can be retrieved directly. However, there are two practical problems concerning the design of a cube-based system: 1) the maintenance cost of the data cubes, and 2) the query cost to answer a selected set of frequently asked queries. Maintaining a data cube requires disk storage and CPU computation, so the maintenance cost is related to the total size of the data cubes materialized, and thus keeping all data cubes is impractical. The total size of cubes may be reduced by merging some cubes. However, the resulting larger cubes will increase the query cost of answering some queries. If the bounds on maintenance cost and query cost are strict, some of the queries need to be sacrificed. An optimization problem in data cube system design has been defined. With a maintenance-cost bound and a query-cost bound given by the user, it is necessary to optimize the initial set of data cubes such that the system can answer a maximum number of queries and satisfy the bounds. This is an NP-complete problem. Approximate algorithms Greedy Removing (GR) and 2-Greedy Merging with Multiple paths (2GGM) are proposed. Experiments have been done on a census database and the results show that our approach is both effective and efficient.
Abstract: Most of the earlier work on clustering mainly focused on numeric data whose inherent geometric properties can be exploited to naturally define distance functions between data points. However, data mining applications frequently involve many data sets that also consist of mixed numeric and categorical attributes. In this paper we present a clustering algorithm which is based on the k-means algorithm. The algorithm clusters objects with numeric and categorical attributes in a way similar to k-means. The object similarity measure is derived from both numeric and categorical attributes. When applied to numeric data, the algorithm is identical to k-means. The main result of this paper is a method to update the 'cluster centers' of clustering objects described by mixed numeric and categorical attributes in the clustering process so as to minimise the clustering cost function. The clustering performance of the algorithm is demonstrated with two well-known data sets, namely the credit approval and abalone databases.
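The abstract above does not spell out the similarity measure or the center update, so the following is only a minimal sketch of one common way to handle mixed data. It assumes squared Euclidean distance on the numeric attributes plus a weighted count of categorical mismatches, with centers formed from numeric means and categorical modes. The function names and the weight `gamma` are hypothetical and are not taken from the paper.

```python
from collections import Counter


def mixed_distance(x_num, x_cat, c_num, c_cat, gamma=1.0):
    """Dissimilarity between an object and a center (assumed form).

    Numeric part: squared Euclidean distance.
    Categorical part: number of mismatching attributes, weighted by gamma.
    With no categorical attributes this reduces to the k-means distance.
    """
    numeric = sum((a - b) ** 2 for a, b in zip(x_num, c_num))
    categorical = sum(a != b for a, b in zip(x_cat, c_cat))
    return numeric + gamma * categorical


def update_center(members):
    """Recompute a center from a non-empty list of (numeric, categorical) tuples.

    Numeric attributes use the mean; categorical attributes use the mode.
    """
    n = len(members)
    num_cols = zip(*(m[0] for m in members))
    cat_cols = zip(*(m[1] for m in members))
    center_num = tuple(sum(col) / n for col in num_cols)
    center_cat = tuple(Counter(col).most_common(1)[0][0] for col in cat_cols)
    return center_num, center_cat
```

Using the mean for numeric attributes and the mode for categorical ones minimises this particular cost within a cluster, which is why the two updates are paired here. Other choices of measure would need a different update rule.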
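Abstract: Only the observable part of the real world can be stored in data. For such incomplete and ill-structured data, data crystallization aims at presenting the hidden structure among events, including unobservable events. This is realized by data crystallization, in which dummy items, corresponding to the potential existence of unobservable events, are inserted into the given data. These dummy items and their relations with observable events are visualized by applying KeyGraph to the data with dummy items, much like the crystallization of snow, in which dust becomes involved in the crystallization of water molecules. To tune the granularity level of the structure to be visualized, the data crystallization tool is integrated with the human process of understanding significant scenarios in the real world. This basic method is expected to be applicable to the various real-world domains in which previous methods of chance discovery have led people to successful decision making. In this paper, we apply data crystallization with human-interactive annealing (DCHA) to the design of products in a real company. The results show its effect on industrial decision making.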
Abstract: Autocorrelation is prevalent in continuous production processes, such as the processes in the chemical and pharmaceutical industries. With the development of measurement technology and data acquisition technology, sampling frequency is getting higher and the existence of autocorrelation cannot be ignored. This paper analyzes five estimation schemes of process capability for autocorrelated data. Comparisons among these schemes are discussed for small and large samples. In conclusion, this paper gives a procedure for process capability analysis of autocorrelated data.
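The five estimation schemes are not named in the abstract, so the sketch below does not reproduce any of them. It only illustrates, under that caveat, the two quantities such an analysis typically starts from: the conventional capability index Cpk, which treats observations as independent, and the lag-1 sample autocorrelation, which indicates whether that independence assumption is plausible. The function names and the specification limits are hypothetical.

```python
import statistics


def lag1_autocorrelation(xs):
    """Lag-1 sample autocorrelation of a series."""
    mean = statistics.fmean(xs)
    num = sum((a - mean) * (b - mean) for a, b in zip(xs, xs[1:]))
    den = sum((a - mean) ** 2 for a in xs)
    return num / den


def cpk(xs, lsl, usl):
    """Conventional Cpk estimate, which assumes independent observations."""
    mu = statistics.fmean(xs)
    sigma = statistics.stdev(xs)
    return min(usl - mu, mu - lsl) / (3 * sigma)
```

A lag-1 autocorrelation far from zero suggests that the naive Cpk and its usual confidence limits should be treated with caution.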
Abstract: Solving complex decision problems requires the use of information from different sources. Usually this information is uncertain, and statistical or probabilistic methods are needed for its processing. However, in many cases a decision maker faces not only uncertainty of a random nature but also imprecision in the description of input data that is of a rather linguistic nature. Therefore, there is a need to merge uncertainties of both types into one mathematical model. In this paper we present a methodology for merging information from imprecisely reported statistical data and imprecisely formulated fuzzy prior information. Moreover, we also consider the case of imprecisely defined loss functions. The proposed methodology may be considered an application of fuzzy statistical methods to decision making in systems analysis.
Abstract: Analyzing schedule risk is an important task in project management. Because a project is a semi-structured or unstructured complex system, quantitative schedule risk analysis (SRA) faces many difficulties. The paper integrates intelligent techniques to obtain the massive basic data required in the risk analysis process, which greatly improves the precision and efficiency of the SRA. In addition, the paper presents a mechanism and architecture for the integrated intelligent systems. Finally, concluding remarks are provided on basic data acquisition in the SRA.
Abstract: In this paper, for zero-failure data (t_i, n_i) at moment t_i, if the prior distribution of the failure probability p_i = P{T < t_i} is the incomplete Fisher-Z distribution Fisher-Z(0, λ_i; a, b), the author gives the hierarchical Bayesian estimate of p_i, and an estimate of reliability under the zero-failure data condition is also obtained. The author also gives a practical worked example using the theory.
Abstract: Large power transformers are important parts of the power supply chain. These very critical networks of engineering assets are an essential base of a nation's energy resource infrastructure. This research identifies the key factors influencing transformer normal operating conditions and predicts the asset management lifespan. Engineering asset research has developed few lifespan forecasting methods combining real-time monitoring solutions for transformer maintenance and replacement. Utilizing the rich data source from a remote terminal unit (RTU) system for sensor-data driven analysis, this research develops an innovative real-time lifespan forecasting approach applying logistic regression based on the Weibull distribution. The methodology and the implementation prototype are verified using a data series from 161 kV transformers to evaluate the efficiency and accuracy for energy sector applications. The asset stakeholders and suppliers benefit significantly from the real-time power transformer lifespan evaluation for maintenance and replacement decision support.
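The abstract does not describe how logistic regression and the Weibull distribution are combined, so the sketch below covers only the Weibull part, as an illustration and not as the authors' method. It evaluates the standard two-parameter Weibull reliability function R(t) = exp(-(t/scale)^shape) and the probability of surviving a further interval given survival so far. The function names and parameter values are hypothetical.

```python
import math


def weibull_reliability(t, shape, scale):
    """Probability that an asset survives beyond time t (two-parameter Weibull)."""
    return math.exp(-((t / scale) ** shape))


def conditional_survival(t, dt, shape, scale):
    """Probability of surviving a further dt, given survival up to time t."""
    return weibull_reliability(t + dt, shape, scale) / weibull_reliability(t, shape, scale)
```

With shape greater than 1 the conditional survival probability falls as the asset ages, which is the wear-out behaviour a lifespan forecast for ageing equipment would be expected to capture.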
Abstract: In this paper, two kinds of Kullback-Leibler criteria with appropriate constraints are proposed to construct empirical likelihood confidence intervals for the mean of right censored data. It is shown that one of the criteria is equivalent to Adimari's (1997) procedure, and the other shares the same asymptotic behavior.
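Abstract: This paper is concerned with tuning the parameters of sampled-data active disturbance rejection control (ADRC) for a class of nonlinear systems when sampling is not sufficiently fast. The theoretical results show the quantitative relationship between the sampling rate, the parameters of ADRC, the size of the uncertainties in the system, and the properties of the closed-loop system. Furthermore, the capability of sampled-data ADRC under a given sampling rate is discussed quantitatively.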
Abstract: A data warehouse often accommodates enormous summary information at various granularities and is mainly used to support on-line analytical processing. Ideally all detailed data should be accessible by residing in some legacy systems or on-line transaction processing systems. In many cases, however, data sources in computers are also kinds of summary data, due to technological problems or budget limits, and also because different aggregation hierarchies may need to be used among various transaction systems. In such circumstances, it is necessary to investigate how to design dimensions, which play a major role in the dimensional model for a data warehouse, and how to estimate summary information that is not stored in the data warehouse. In this paper, rough set theory is applied to support dimension design and information estimation.
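The abstract does not say how rough set theory is applied to dimension design, so the following is only a sketch of the basic rough set construction, not of the paper's procedure. Records that agree on a chosen set of attributes are grouped into one block, and a target set is then bracketed by a lower approximation (blocks wholly inside the target) and an upper approximation (blocks that overlap it). The function name and the record layout are hypothetical.

```python
from collections import defaultdict


def approximations(records, attrs, target):
    """Return the (lower, upper) rough approximations of a target set.

    records: dict mapping record id -> dict of attribute values
    attrs:   attributes that define which records are indiscernible
    target:  set of record ids to approximate
    """
    # Group records that are indiscernible on the chosen attributes.
    classes = defaultdict(set)
    for rid, row in records.items():
        classes[tuple(row[a] for a in attrs)].add(rid)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # block certainly belongs to the target
            lower |= block
        if block & target:    # block possibly belongs to the target
            upper |= block
    return lower, upper
```

The gap between the two approximations is what summary data at a coarse granularity cannot resolve, which gives a rough indication of how much a given choice of dimension attributes loses.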
Abstract: With a great deal of remote sensing data constantly delivered to the ground from Landsat, the Remote Sensing Satellite Ground Station has accumulated abundant satellite remote sensing data. For lack of effective data mining (DM) and knowledge discovery from databases (KDD) techniques for these data, most of the information cannot be used efficiently. Technical innovation in and improvement of traditional DM and KDD, together with further study of these techniques, will raise the level and intelligence of interpretation and allow remote sensing information to be explored and utilized to the maximum degree. Based on traditional data mining and KDD, the authors examined the technical flow of DM and KDD for remote sensing, designed a systematic framework for multi-source remote sensing DM, put forward a prototype multi-source remote sensing DM system, and established a base for further exploring and developing multi-source remote sensing DM systems.