Feature Selection for Classification



Intelligent Data Analysis 1 (1997) 131–156

M. Dash (1), H. Liu (2)
Department of Information Systems & Computer Science, National University of Singapore, Singapore 119260
(1) E-mail: manoranj@… (2) E-mail: liuh@…

Received 24 January 1997; revised 3 March 1997; accepted 21 March 1997

Abstract

Feature selection has been the focus of interest for quite some time and much work has been done. With the creation of huge databases and the consequent requirements for good machine learning techniques, new problems arise and novel approaches to feature selection are in demand. This survey is a comprehensive overview of many existing methods from the 1970's to the present. It identifies four steps of a typical feature selection method, categorizes the different existing methods in terms of generation procedures and evaluation functions, and reveals hitherto unattempted combinations of generation procedures and evaluation functions. Representative methods are chosen from each category for detailed explanation and discussion via example. Guidelines for applying feature selection methods are given based on data types and domain characteristics. This survey identifies the future research areas in feature selection, introduces newcomers to this field, and paves the way for practitioners who search for suitable methods for solving domain-specific real-world applications. (Intelligent Data Analysis, Vol. 1, No. 3)

Keywords: Feature selection; Classification

1. Introduction

The majority of real-world classification problems require supervised learning where the underlying class probabilities and class-conditional probabilities are unknown, and each instance is associated with a class label. In real-world situations, relevant features are often unknown a priori; therefore, many candidate features are introduced to better represent the domain. Unfortunately many of these are either partially or completely irrelevant/redundant to the target concept. A relevant feature is neither irrelevant nor redundant to the target concept; an irrelevant feature does not affect the target concept in any way, and a redundant feature does not add anything new to the target concept [21]. In many applications, the size of a dataset is so large that learning might not work well before removing these unwanted features. Reducing the number of irrelevant/redundant features drastically reduces the running time of a learning algorithm and yields a more general concept.

1088-467X/97/$17.00 © 1997 Elsevier Science B.V. All rights reserved. PII: S1088-467X(97)00008-5

This helps in getting a better insight into the underlying concept of a real-world classification problem [23,24]. Feature selection methods try to pick a subset of features that is relevant to the target concept. Feature selection has been defined by many authors by looking at it from various angles. As expected, many of those definitions are similar in intuition and/or content. The following lists those that are conceptually different and cover a range of definitions:

1. Idealized: find the minimally sized feature subset that is necessary and sufficient to the target concept [22].
2. Classical: select a subset of M features from a set of N features, M < N, such that the value of a criterion function is optimized over all subsets of size M [34].
3. Improving prediction accuracy: choose a subset of features that improves prediction accuracy, or decreases the size of the structure, without significantly decreasing the prediction accuracy of the classifier built using only the selected features.
4. Approximating original class distribution: select a small subset such that the resulting class distribution, given only the values for the selected features, is as close as possible to the original class distribution given all feature values.

A typical feature selection method consists of four basic steps: a generation procedure to generate the next candidate subset, an evaluation function to evaluate the subset under examination, a stopping criterion to decide when to stop, and a validation procedure to check whether the subset is valid. Without a suitable stopping criterion the feature selection process may run exhaustively or forever through the space of subsets; the generation procedures and evaluation functions can influence the choice of a stopping criterion. Stopping criteria based on a generation procedure include: (i) whether a predefined number of features are selected, and (ii) whether a predefined number of iterations is reached. Stopping criteria based on an evaluation function can be: (i) whether addition (or deletion) of any feature does not produce a better subset; and (ii) whether an optimal subset according to some evaluation function is obtained. The loop continues until some stopping criterion is satisfied, and the feature selection process halts by outputting a selected subset of features to a validation procedure. There are many variations to this three-step feature selection process. The validation procedure is not a part of the feature selection process itself, but a feature selection method (in practice) must be validated: it tries to test the validity of the selected subset by carrying out different tests, and comparing the results with previously established results, or with the results of competing feature selection methods, using artificial datasets, real-world datasets, or both.

There have been quite a few attempts to survey this field. Prominent among these are Doak's [13] and Siedlecki and Sklansky's [46] surveys. Siedlecki and Sklansky discussed the evolution of feature selection methods and grouped the methods into past, present, and future categories. Their main focus was the branch and bound method [34] and its variants [16]. Their survey was published in the year 1987, and since then many new and efficient methods have appeared (e.g., Focus [2], Relief [22], LVF [28]). Doak followed a similar approach to Siedlecki and Sklansky's survey, grouped the different search algorithms and evaluation functions used in feature selection methods independently, and ran experiments over them. In this article, a survey is conducted for feature selection methods starting from the early 1970's [33] to the most recent methods [28]. In the next section, the two major steps of feature selection (generation procedure and evaluation function) are divided into different groups, and 32 different feature selection methods are categorized based on the type of generation procedure and evaluation function that is used.
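Read as code, the four steps above form a simple search loop. The following is a minimal sketch, assuming caller-supplied generate, evaluate, and stop_criterion callables (hypothetical placeholders, not part of any surveyed method); all code sketches in this article use Python.

```python
def feature_selection(data, generate, evaluate, stop_criterion):
    """Skeleton of the typical four-step feature selection process.

    generate(state)        -> next candidate feature subset (step 1)
    evaluate(subset, data) -> goodness of the candidate subset (step 2)
    stop_criterion(state)  -> True when the search should halt (step 3)
    All three callables are hypothetical placeholders for a concrete
    generation procedure, evaluation function, and stopping criterion.
    """
    best_subset, best_score = None, float("-inf")
    state = {"iteration": 0, "best_score": best_score}
    while not stop_criterion(state):
        candidate = generate(state)
        score = evaluate(candidate, data)
        if score > best_score:                 # keep the best subset so far
            best_subset, best_score = candidate, score
            state["best_score"] = best_score
        state["iteration"] += 1
    return best_subset                         # step 4: validate separately
```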

This framework helps in finding the unexplored combinations of generation procedures and evaluation functions. In Section 3 we briefly discuss the methods under each category, and select a representative method from each for a detailed discussion. Section 4 describes an empirical comparison of the representative methods using three artificial datasets suitably chosen to highlight their benefits and limitations. Section 5 consists of discussions on various dataset characteristics that influence the choice of a suitable feature selection method, and some guidelines regarding how to choose a feature selection method for an application at hand are given based on these. The paper concludes in Section 6 with further discussions on future research based on the findings. Our objective is that this article will assist in finding suitable feature selection methods.

2. Study of Feature Selection Methods

In this section we categorize the two major steps of feature selection: the generation procedure and the evaluation function. A framework is presented in which a total of 32 methods are grouped based on the combination of the two that they use.

2.1. Generation Procedures

If the original feature set contains N features, the total number of candidate subsets is 2^N. There are different approaches for searching this space, namely: complete, heuristic, and random.

Complete. This generation procedure does a complete search for the optimal subset according to the evaluation function used. However, Schlimmer [43] argues that "just because the search must be complete does not mean that it must be exhaustive." Different heuristic functions are used to reduce the search without jeopardizing the chances of finding the optimal subset. Hence, although the order of the search space is O(2^N), fewer subsets are evaluated. The optimality of the feature subset, according to the evaluation function, is guaranteed because backtracking can be done, using various techniques such as branch and bound and best first search.

Heuristic. In each iteration of this generation procedure, all remaining features yet to be selected (rejected) are considered for selection (rejection). There are many variations to this simple process, but generation of subsets is basically incremental (either increasing or decreasing). The order of the search space is O(N^2) or less; some exceptions are Relief [22] and DTM [9]. These procedures are very simple to implement and very fast in producing results, but at the cost of possibly missing the optimal subset; a skeleton of a typical incremental procedure is sketched below.
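As an illustration of the incremental structure, the following is a minimal sequential forward skeleton; evaluate is again a hypothetical scoring callable standing in for any evaluation function of Section 2.2. Each of at most N iterations scans the remaining features once, giving the quadratic behaviour noted above.

```python
def sequential_forward_generation(features, data, evaluate, m):
    """Greedy incremental generation: every remaining feature is considered
    for selection in each iteration, so at most N + (N-1) + ... + 1, i.e.,
    O(N^2), candidate subsets are evaluated overall."""
    selected, remaining = [], list(features)
    while remaining and len(selected) < m:
        # consider all remaining features; keep the one that scores best
        best = max(remaining, key=lambda f: evaluate(selected + [f], data))
        selected.append(best)
        remaining.remove(best)
    return selected
```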

Random. This generation procedure is rather new compared to the other two. Even though the search space is O(2^N), these methods typically search a smaller number of subsets by setting a maximum number of iterations; the optimality of the selected subset then depends on the resources available. The assignment of suitable values to the parameters of these procedures is an important task.

2.2. Evaluation Functions

An optimal subset is always relative to a certain evaluation function (i.e., an optimal subset chosen using one evaluation function may not be the same as that chosen using another evaluation function). Typically, an evaluation function tries to measure the discriminating ability of a feature, or of a subset, to distinguish the different class labels. Langley [26] grouped feature selection methods into two broad groups (filter and wrapper) based on their dependence on the inductive algorithm that will finally use the selected subset: filter methods are independent of the inductive algorithm, whereas wrapper methods use it as the evaluation function. Ben-Bassat [4] grouped the evaluation functions existing until 1982 into three categories: information or uncertainty measures, distance measures, and dependence measures, and suggested that the dependence measures can be divided between the first two; he did not consider the classification error rate as an evaluation function. Doak [13] divided the evaluation functions into three categories: data intrinsic measures, classification error rate, and estimated or incremental error rate, where the data intrinsic category includes distance, entropy, and dependence measures. Considering these divisions and the latest developments, we divide the evaluation functions into five categories: distance, information (or uncertainty), dependence, consistency, and classifier error rate. In the following subsections we briefly discuss each of these.

2.2.1. Distance Measures

This type is also known as a separability, divergence, or discrimination measure. For a two-class problem, a feature X is preferred to another feature Y if X induces a greater difference between the two class-conditional probabilities than Y does; if the difference is zero, X and Y are indistinguishable.

2.2.2. Information Measures

These measures typically determine the information gain from a feature. The information gain from a feature X is defined as the difference between the prior uncertainty and the expected posterior uncertainty using X. Feature X is preferred to feature Y if the information gain from X is greater than that from Y (e.g., the entropy measure) [4].
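As a concrete reading of this measure, the sketch below computes the information gain of a single discrete feature as the prior class entropy minus the expected posterior entropy. The implementation details are illustrative, not taken from any particular surveyed method.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(values, labels):
    """Prior uncertainty minus expected posterior uncertainty given the feature."""
    n = len(labels)
    expected_posterior = 0.0
    for v in set(values):
        subset = [c for x, c in zip(values, labels) if x == v]
        expected_posterior += (len(subset) / n) * entropy(subset)
    return entropy(labels) - expected_posterior

# Feature X is preferred to feature Y when
# information_gain(X_values, labels) > information_gain(Y_values, labels).
```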

2.2.3. Dependence Measures

Dependence measures or correlation measures quantify the ability to predict the value of one variable from the value of another. The correlation coefficient is a classical dependence measure and can be used to find the correlation between a feature and a class: if the correlation of feature X with class C is higher than the correlation of feature Y with C, then feature X is preferred to Y. A slight variation is to determine the dependence of a feature on the other features; this value indicates the degree of redundancy of the feature. All evaluation functions based on dependence measures can be divided between distance and information measures. But these are still kept as a separate category, because conceptually they represent a different viewpoint [4]. More about the above three measures can be found in Ben-Bassat's [4] survey.

2.2.4. Consistency Measures

These measures are characteristically different from the other measures because of their heavy reliance on the training dataset and the use of the Min-Features bias in selecting a subset of features [3]. The Min-Features bias prefers consistent hypotheses definable over as few features as possible. These measures find the minimally sized subset that satisfies the acceptable inconsistency rate, which is usually set by the user.

2.2.5. Classifier Error Rate Measures

The methods using this type of evaluation function are called "wrapper methods" (i.e., the classifier is the evaluation function). As the features are selected using the classifier that later on uses these selected features in predicting the class labels of unseen instances, the accuracy level is very high, although computationally quite costly [21].

Table 1 shows a comparison of the various evaluation functions.

Table 1: A Comparison of Evaluation Functions

  Evaluation Function    | Generality | Time Complexity | Accuracy
  Distance Measure       | Yes        | Low             | –
  Information Measure    | Yes        | Low             | –
  Dependence Measure     | Yes        | Low             | –
  Consistency Measure    | Yes        | Moderate        | –
  Classifier Error Rate  | No         | High            | Very High

The different parameters used for the comparison are: Generality, how suitable the selected subset is for different classifiers (not just for one classifier); Time complexity, the time taken for selecting the subset of features; and Accuracy, how accurate the resulting classification is. A '–' in the last column means that nothing can be concluded about the accuracy: except for the classifier error rate, the accuracy of all other evaluation functions depends on the dataset and the classifier used for classification after feature selection. There is also a trade-off between accuracy and time (e.g., the more time spent, the higher the accuracy). The table also tells us which measure should be used under different circumstances; for example, under time constraints, and given several classifiers to choose from, the classifier error rate should not be the evaluation function of choice.

2.3. A Framework

In this subsection, we present a framework in which the generation procedures and evaluation functions are considered as two dimensions, and each method is grouped according to these two dimensions (see Table 2).

Table 2: Two-Dimensional Categorization of Feature Selection Methods
(rows: evaluation measure; columns: generation procedure; roman numerals label the 15 combinations, with empty combinations unattempted)

  Distance Measure:
    Heuristic: I (Sec. 3.1): Relief, Relief-F, Sege84
    Complete:  II (Sec. 3.2): Branch and Bound, BFF, Bobr88
    Random:    III (empty)
  Information Measure:
    Heuristic: IV (Sec. 3.3): DTM, Koll-Saha96
    Complete:  V (Sec. 3.4): MDLM
    Random:    VI (empty)
  Dependence Measure:
    Heuristic: VII (Sec. 3.5): POE+ACC, PRESET
    Complete:  VIII (empty)
    Random:    IX (empty)
  Consistency Measure:
    Heuristic: X (empty)
    Complete:  XI (Sec. 3.6): Focus, Schl93, MIFES-1
    Random:    XII (Sec. 3.7): LVF
  Classifier Error Rate:
    Heuristic: XIII (Sec. 3.8): SBS, SFS, SBS-SLASH, PQSS, BDS, Moor-Lee94, RC, Quei-Gels84
    Complete:  XIV (Sec. 3.8): Ichi-Skla84a, Ichi-Skla84b, AMB&B, BS
    Random:    XV (Sec. 3.8): LVW, GA, SA, RGSS, RMHC-PF1

We have chosen a total of 32 methods from the literature, and these are grouped according to the combination of generation procedure and evaluation function used (see Table 2). A distinct achievement of this framework is the finding of a number of combinations of generation procedure and evaluation function (the empty boxes in the table) that do not appear in any existing method yet, to the best of our knowledge. In the framework, a column stands for a type of generation procedure and a row for a type of evaluation function. The assignment of evaluation functions within the categories may be equivocal, because several evaluation functions may be placed in different categories when considered from different perspectives, and one evaluation function may be obtained as a mathematical transformation of another evaluation function [4]. In the next two sections we explain each category and its methods, and choose a method from each category for a detailed discussion using pseudocode and a mini-dataset (Section 3), and for an empirical comparison (Section 4). Out of the 15 combinations (i.e., 5 types of evaluation functions times 3 types of generation procedures), several remain unattempted.

3. Methods by Category

In this section, we discuss each category by briefly describing the methods under it, and then choose a representative method and explain it in detail. The methods in the last row of Table 2 represent the "wrapper" methods, where the evaluation function is the classifier error rate. A typical wrapper method can use different kinds of classifiers for evaluation; hence no representative is chosen, and they are discussed briefly as a group.

The typical dataset used for a hand-run of the representative methods is shown in Table 3; it consists of 16 instances (originally 32 instances) of the CorrAL dataset. This mini-dataset has binary classes and six boolean features (A0, A1, B0, B1, I, C), where feature I is irrelevant, feature C is correlated to the class label 75% of the time, and the other four features are relevant to the boolean target concept (A0 ∧ A1) ∨ (B0 ∧ B1). In all the pseudocodes, D denotes the training set, S is the original feature set, N is the number of features, T is the selected subset, and M is the number of selected (or required) features.

Table 3: Sixteen Instances of CorrAL

  #   A0 A1 B0 B1 I  C  Class
  1   0  0  0  0  0  1  0
  2   0  0  0  1  1  1  0
  3   0  0  1  0  0  1  0
  4   0  0  1  1  0  0  1
  5   0  1  0  0  0  1  0
  6   0  1  0  1  1  1  0
  7   0  1  1  0  1  0  0
  8   0  1  1  1  0  0  1
  9   1  0  0  0  1  1  0
  10  1  0  0  1  1  0  0
  11  1  0  1  0  0  1  0
  12  1  0  1  1  0  0  1
  13  1  1  0  0  0  0  1
  14  1  1  0  1  0  1  1
  15  1  1  1  0  1  0  1
  16  1  1  1  1  0  0  1

3.1. Category I: Generation—Heuristic, Evaluation—Distance

3.1.1. Description of Various Methods

As seen from Table 2, the most prominent method in this category is Relief [22]; we discuss it first. Relief is a feature weight-based algorithm inspired by instance-based learning algorithms [1,8]. From the set of training instances, it first chooses a sample of instances; the user must provide the number of instances in this sample (NoSample). Relief randomly picks this sample of instances, and for each instance in it finds the nearHit and nearMiss instances based on Euclidean distance. nearHit is the instance having minimum Euclidean distance among all instances of the same class as the chosen instance; nearMiss is the instance having minimum Euclidean distance among all instances of the other class. Relief updates the weights of the features (initialized to zero in the beginning) based on the intuitive idea that a feature is more relevant if it distinguishes between an instance and its nearMiss, and less relevant if it distinguishes between an instance and its nearHit. After exhausting all instances in the sample, Relief selects all features having weight greater than or equal to a threshold. This threshold can be automatically evaluated using a function of the number of instances in the sample; it can also be determined by inspection (all features with positive weights are selected). Relief works for noisy and correlated features, and its running time is linear in the number of features and in NoSample. Its major limitation is that it does not help with redundant features; a minimal sketch of its weight-update loop follows.
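Below is a minimal sketch of this loop for boolean features and binary classes. The squared-difference update and the normalization by NoSample are one plausible reading of the algorithm; exact details vary across presentations of Relief, so treat this as illustrative rather than definitive.

```python
import random

def relief(data, labels, no_sample, threshold, seed=0):
    """Illustrative Relief sketch for 0/1 feature vectors and binary labels.
    Assumes each class occurs at least twice in `data`."""
    rng = random.Random(seed)
    n_features = len(data[0])
    weights = [0.0] * n_features                  # weights start at zero

    def sq_distance(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    for _ in range(no_sample):
        i = rng.randrange(len(data))              # pick a sample instance
        x, y = data[i], labels[i]
        hits = [j for j in range(len(data)) if labels[j] == y and j != i]
        misses = [j for j in range(len(data)) if labels[j] != y]
        near_hit = data[min(hits, key=lambda j: sq_distance(x, data[j]))]
        near_miss = data[min(misses, key=lambda j: sq_distance(x, data[j]))]
        for f in range(n_features):
            # reward features that separate x from its nearMiss,
            # penalize features that separate x from its nearHit
            weights[f] += (x[f] - near_miss[f]) ** 2 - (x[f] - near_hit[f]) ** 2

    return [f for f in range(n_features) if weights[f] / no_sample >= threshold]
```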

Because it does not handle redundancy, Relief may generate a non-optimal feature set; this can be overcome by a subsequent exhaustive search over the selected features. A second limitation, that it works only for two-class problems, is overcome by Relief-F [25]. A further limitation is that the user may find it difficult to choose a proper NoSample. Segen's [44] method uses an evaluation function that minimizes the sum of a statistical discrepancy measure and the feature complexity measure (in bits). It finds the first feature that best distinguishes the classes, and iteratively looks for additional features which, in combination with the features already chosen, improve class discrimination.

3.1.2. Hand-Run of the CorrAL Dataset (see Figure 2)

Assume instance #5 (i.e., [0,1,0,0,0,1]) is chosen. Relief finds its nearHit and nearMiss using the difference between two instances, i.e., the number of feature values in which they differ. For the chosen instance #5, instance #1 is the nearHit (difference = 1), and instances #13 and #14 are the nearMiss (difference = 2). The feature weights are updated accordingly, and the process is iterated NoSample times, as specified by the user. Typically, the weights turn out negative for irrelevant features and positive for relevant ones. For the CorrAL dataset Relief selects {A0, B0, B1, C} (more about this in Section 4.3).

3.2. Category II: Generation—Complete, Evaluation—Distance

3.2.1. Description of Various Methods

This combination is found in old methods such as branch and bound [34]. Other methods in this category are variations of the branch and bound (B&B) method in terms of the generation procedure used (BFF [50]) or the evaluation function used (Bobrowski's method [5]). We first discuss the branch and bound method, followed by a brief discussion of the other two methods.

Narendra and Fukunaga defined feature selection in the classical way (see definition 2 in Section 1) and assumed monotonicity, i.e., a subset of features should not be better than any larger set that contains the subset. This definition has a severe drawback for real-world problems, because many practical evaluation functions are not monotonic. The definition can be slightly modified to make it applicable to general problems as well, by saying that B&B attempts to satisfy two criteria: (i) the selected subset should be as small as possible; and (ii) a bound is placed on the value calculated by the evaluation function [13]. As per this modification, B&B starts searching from the original feature set and removes features with backtracking. When the evaluation function obeys the monotonicity principle, any subset whose value falls below the bound is discarded together with all subsets of it, i.e., its branch of the search tree is pruned. Evaluation functions generally used are: the Mahalanobis distance [15], the discriminant function, the Fisher criterion, the Bhattacharya distance, and the divergence [34]. A hedged sketch of this backtracking search is given below. Xu, Yan, and Chang [50] proposed a similar algorithm (BFF), where the search procedure is modified to solve the problem of searching for an optimal path in a weighted tree using the informed best first search strategy of artificial intelligence; this algorithm guarantees the globally best subset without exhaustive enumeration. Bobrowski [5] proves that the homogeneity coefficient f*_Ik can be used to measure the degree of linear dependence among some measurements, and shows that it is monotonic (i.e., if S1 ⊂ S2 then f*_Ik(S1) > f*_Ik(S2)). Hence it can be suitably converted into a feature selection method by implementing it as the evaluation function for branch and bound with backtracking, or for a better variant. Note that an evaluation function must be monotonic to be applicable to these methods; this restriction is partially lifted by relaxing the monotonicity criterion and introducing the approximate monotonicity concept [16].
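The sketch below illustrates the backtracking search; evaluate is a hypothetical callable assumed to be monotonic (a subset never scores higher than a superset that contains it), for example a Mahalanobis distance criterion.

```python
def branch_and_bound(features, evaluate, m):
    """Find the best subset of size m by deleting features from the full set.
    Relies on monotonicity: once a subset's score falls to or below the best
    bound found so far, none of its subsets can do better, so the branch is
    pruned. `evaluate` is a hypothetical monotonic criterion."""
    best = {"score": float("-inf"), "subset": None}

    def search(subset, start):
        score = evaluate(subset)
        if score <= best["score"]:
            return                                  # prune: no descendant can win
        if len(subset) == m:
            best["score"], best["subset"] = score, list(subset)
            return
        for i in range(start, len(subset)):         # delete one feature in turn,
            search(subset[:i] + subset[i + 1:], i)  # backtracking afterwards

    search(list(features), 0)
    return best["subset"]
```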

3.2.2. Hand-Run of the CorrAL Dataset (see Figure 3)

The algorithm needs the required number of features (M) as input, and it attempts to find the best subset of that size using branch and bound. It begins with the full set of features S^0 and removes one feature in turn from S^{l-1} to generate the subsets S_j^l, where l is the current level and j specifies the subset within that level. If U(S_j^l) falls below the bound, S_j^l stops growing (its branch is pruned); otherwise it grows to level l+1, i.e., one more feature is removed. For the CorrAL dataset, with M set to 4, the subset (A0, A1, B0, I) is selected.

3.3. Category IV: Generation—Heuristic, Evaluation—Information

3.3.1. Description of Various Methods

We have found two methods under this category: the decision tree method (DTM) [9] and Koller and Sahami's method [24]. In DTM, feature selection is used in an application on natural language processing: C4.5 [39] is run over the training set, and the features appearing in the pruned decision tree are selected; in other words, the selected subset is the union of the subsets of features appearing in the paths to the leaf nodes of the pruned tree. The second method, which is very recent, is based on the intuition that any feature that carries little or no additional information beyond that subsumed by the remaining features is either irrelevant or redundant. To formalize this, Koller and Sahami approximate the Markov blanket, where a subset T is a Markov blanket for feature f_i if, given T, f_i is conditionally independent both of the class label and of all features not in T (including f_i itself). The implementation of the Markov blanket is suboptimal in many ways.

Fig. 4. Decision tree method (DTM).

3.3.2. Hand-Run of the CorrAL Dataset (see Figure 4)

C4.5 uses an information-based heuristic, a simple form of which for a two-class problem (as in our example) is

  $I(p, n) = -\frac{p}{p+n}\log_2\frac{p}{p+n} - \frac{n}{p+n}\log_2\frac{n}{p+n}$,

where p is the number of positive instances and n the number of negative instances. Assume that using an attribute F1 as the root of the tree partitions the training set into T0 and T1 (each feature takes binary values only), with p_i positive and n_i negative instances in T_i. The entropy of feature F1 is then

  $E(F_1) = \frac{p_0+n_0}{p+n} I(p_0, n_0) + \frac{p_1+n_1}{p+n} I(p_1, n_1)$.

These two expressions are transcribed in the sketch below.
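The sketch is a direct transcription of the two formulas above for binary features and classes; it is not the full C4.5 implementation.

```python
import math

def I(p, n):
    """Expected information of a node with p positive and n negative instances.
    Uppercase names mirror the paper's notation."""
    total = p + n
    fractions = [x / total for x in (p, n) if x > 0]   # treat 0*log2(0) as 0
    return -sum(f * math.log2(f) for f in fractions)

def E(values, labels):
    """Weighted entropy after partitioning on a binary feature."""
    total = len(labels)
    result = 0.0
    for v in (0, 1):
        p = sum(1 for x, c in zip(values, labels) if x == v and c == 1)
        n = sum(1 for x, c in zip(values, labels) if x == v and c == 0)
        if p + n:
            result += (p + n) / total * I(p, n)
    return result

# For feature C of CorrAL: E(C) = 8/16 * I(1, 7) + 8/16 * I(6, 2) ≈ 0.677.
```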

Considering the CorrAL dataset, take feature C: when C takes the value '0', one instance is positive (class = 1) and seven instances are negative (class = 0); for the value '1', six instances are positive and two are negative. Hence

  $E(C) = \frac{1+7}{16} I(1, 7) + \frac{6+2}{16} I(6, 2) \approx 0.677$.

This value is the minimum among all features, so C is selected as the root, and the original training set of sixteen instances is divided into two nodes. In these two nodes, again the features having the least entropy are selected, and the process halts when each partition contains instances of a single class. The decision tree constructed thus is pruned, basically to avoid over-fitting, and the features appearing in the pruned tree, (A0, A1, B0, B1, C), are selected.

3.4. Category V: Generation—Complete, Evaluation—Information

3.4.1. Description

Under this category we found the Minimum Description Length Method (MDLM) [45]. In this method, the authors attempt to eliminate all useless (irrelevant and/or redundant) features. According to the authors, if the features in a subset V can be expressed as a fixed, non-class-dependent function F of the features in another subset U, then once the values of the features in subset U are known, the features in subset V become useless. For feature selection, U and V together form the whole feature set. To formalize this, the authors use the minimum description length criterion (MDLC) introduced by Rissanen [40]. They formulated an expression that can be interpreted as the number of bits required to transmit the classes of the instances, the optimal parameters, the useful features, and (finally) the useless features. The algorithm exhaustively searches all the possible subsets (2^N) and outputs the subset satisfying MDLC. The method can find the useful features for Gaussian cases; for non-Gaussian cases it may not be able to find the useful features (more about this in Section 4.3).

Fig. 5. Minimum description length method (MDLM).

3.4.2. Hand-Run of the CorrAL Dataset

As seen in Figure 5, MDLM is basically the evaluation of an equation: one computes the covariance matrices of the candidate subsets, determines the determinants of the sub-matrices D_L(i) and D_L for the equation as shown in Figure 5, and finds the subset minimizing the description length. For the CorrAL dataset the chosen feature subset is {C}.

3.5. Category VII: Generation—Heuristic, Evaluation—Dependence

3.5.1. Description of Various Methods

We found the POE+ACC (Probability of Error & Average Correlation Coefficient) method [33] and PRESET [31] under this category. Seven techniques of feature selection are presented in the POE+ACC paper; we discuss the last (i.e., the seventh) method. In this method, the first feature selected is the feature with the smallest probability of error (Pe). The next feature selected is the feature that produces the minimum weighted sum of Pe and the average correlation coefficient (ACC), where ACC is the mean of the correlation coefficients of the candidate feature with the features already selected.

The method can rank all the features based on this weighted sum, and a required number of features (M) can then be selected from the ranking. PRESET uses the concept of rough sets: it first finds a reduct (a reduct R of a set P classifies instances equally well as P does) and removes all features not in the reduct. It then ranks the features based on their significance, where the significance of a feature is a measure expressing how important the feature is for classification.

3.5.2. Hand-Run of the CorrAL Dataset (see Figure 6)

The first feature chosen is the feature having minimum Pe, and in each iteration thereafter the feature having minimum w1(Pe) + w2(ACC) is chosen. In the experiments, we consider w1 to be 0.1 and w2 to be 0.9 (the authors suggest these values for their case study). To calculate Pe, the a priori class probabilities are computed first; for the mini-dataset these are 9/16 for class 0 and 7/16 for class 1. Then, for each feature, the class-conditional probabilities are computed. We find that for class label 0, the class-conditional probability of feature C having the value 0 is 2/9 and that of value 1 is 7/9; for class label 1, the class-conditional probability of feature C having the value 0 is 6/7 and that of value 1 is 1/7. The Pe computation is sketched below.
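The following sketch reproduces this Pe computation; it follows the hand-run description given here rather than the original POE+ACC paper, so treat it as illustrative.

```python
from collections import Counter

def probability_of_error(values, labels):
    """Pe of one discrete feature: for each feature value predict the class
    maximizing prior * class-conditional probability, then return the
    fraction of instances whose actual class mismatches that prediction."""
    n = len(labels)
    prior = {c: k / n for c, k in Counter(labels).items()}
    predicted = {}
    for v in set(values):
        def joint(c):
            # prior P(c) times class-conditional P(value = v | class = c)
            in_class = [x for x, l in zip(values, labels) if l == c]
            return prior[c] * in_class.count(v) / len(in_class)
        predicted[v] = max(prior, key=joint)
    mismatches = sum(1 for x, l in zip(values, labels) if predicted[x] != l)
    return mismatches / n

# On CorrAL this predicts class 1 when C = 0 and class 0 when C = 1,
# giving Pe(C) = 3/16, the smallest Pe among the six features.
```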

For each feature value (e.g., feature C taking value 0), we find the class label for which the product of the a priori class probability and the class-conditional probability is the maximum; this is the predicted class. When feature C takes the value 0, the prediction is class 1, and when feature C takes the value 1, the prediction is class 0. Scanning all instances, we count the mismatches between the predicted and actual class labels; feature C has a 3/16 fraction of mismatches (Pe). In fact, among all features, C has the least Pe and, hence, is selected as the first feature. In the second step, the correlations of all remaining features (A0, A1, B0, B1, I) with C are computed. Using the expression w1(Pe) + w2(ACC), we find that feature A0 has the least value among the five. If M (the required number) is 4, then the subset (C, A0, B0, I) is finally selected.

3.6. Category XI: Generation—Complete, Evaluation—Consistency

3.6.1. Description of Various Methods

We discuss Focus [2], Schlimmer's method [43], and MIFES-1 [35], and select Focus as the representative. Focus implements the Min-Features bias that prefers consistent hypotheses definable over as few features as possible. In its simplest implementation, it does a breadth-first search and checks each generated subset for inconsistency. Strictly speaking, Focus is unable to handle noise, but a simple modification that allows a certain percentage of inconsistency enables it to do so. Schlimmer's method uses a systematic enumeration scheme as the generation procedure, with a heuristic function used to prune the search; the heuristic is a reliability measure, based on the intuition that the probability that an inconsistency will be observed is proportional to the percentage of feature values observed. MIFES-1, which can handle only binary classes and boolean features, represents the set of instances in the form of a matrix, each element of which stands for a unique combination of a positive instance (class = 1) and a negative instance (class = 0). A feature f is said to cover an element of the matrix if it assumes opposite values on the pair of instances the element stands for. It iteratively searches for a cover with N−1 features, starting from one with all N features.

3.6.2. Hand-Run of the CorrAL Dataset (see Figure 7)

As Focus uses a breadth-first generation procedure, it first generates all subsets of size one (i.e., {A0}, {A1}, {B0}, {B1}, {I}, {C}), followed by all subsets of size two, and so on. For each subset generated, it checks whether there are at least two instances in the dataset having equal values for all the features in the subset but different class labels (i.e., an inconsistency). If such a case arises, it rejects the subset as inconsistent, and continues until it finds a consistent subset (or until all possible subsets are found inconsistent). For the subset {A0}, instances #1 and #4 have the same value for A0 (i.e., 0) but different class labels (0 and 1, respectively).

,/IntelligentDataAnalysis1(1997)131–tively).Henceitrejectsitandmovesontothenextsubset{A1}.Itevaluatesatotalof41subsets

Copyright ©2019-2024 Comsenz Inc.Powered by © 易纺专利技术学习网 豫ICP备2022007602号 豫公网安备41160202000603 站长QQ:729038198 关于我们 投诉建议