
Pattern Recognition Letters 24 (2003) 2195–2207

Supervised fuzzy clustering for the identification of fuzzy classifiers

Janos Abonyi *, Ferenc Szeifert

Department of Process Engineering, University of Veszprem, P.O. Box 158, H-8201 Veszprem, Hungary

Received 9 July 2001; received in revised form 26 August 2002

Abstract

The classical fuzzy classifier consists of rules, each one describing one of the classes. In this paper a new fuzzy model structure is proposed where each rule can represent more than one class with different probabilities. The obtained classifier can be considered as an extension of the quadratic Bayes classifier that utilizes a mixture of models for estimating the class conditional densities. A supervised clustering algorithm has been worked out for the identification of this fuzzy model. The relevant input variables of the fuzzy classifier have been selected based on the analysis of the clusters by Fisher's interclass separability criteria. This new approach is applied to the well-known wine and Wisconsin breast cancer classification problems. © 2003 Elsevier B.V. All rights reserved.

Keywords: Fuzzy clustering; Bayes classifier; Rule-reduction; Transparency and interpretability of fuzzy classifiers

1. Introduction

Typical fuzzy classifiers consist of interpretable if–then rules with fuzzy antecedents and class labels in the consequent part. The antecedents (if-parts) of the rules partition the input space into a number of fuzzy regions by fuzzy sets, while the consequents (then-parts) describe the output of the classifier in these regions. Fuzzy logic improves rule-based classifiers by allowing the use of overlapping class definitions and improves the interpretability of the results by providing more insight into the decision making process. Fuzzy logic, however, is not a guarantee for interpretability, as was also recognized in (Valente de Oliveira, 1999; Setnes et al., 1998). Hence, real effort must be made to keep the resulting rule-base transparent.

* Corresponding author. Tel.: +36-88-422-022 4290; fax: +36-88-422-022 4171. E-mail address: abonyij@fmt.vein.hu (J. Abonyi). URL: http://www.fmt.vein.hu/softcomp.

The automatic determination of compact fuzzy classifier rules from data has been approached by several different techniques: neuro-fuzzy methods (Nauck and Kruse, 1999), genetic-algorithm (GA)-based rule selection (Ishibuchi et al., 1999), and fuzzy clustering in combination with GA-optimization (Roubos and Setnes, 2000). Generally, the bottleneck of the data-driven identification of fuzzy systems is the structure identification, which requires non-linear optimization. Thus for high-dimensional problems the initialization of the fuzzy model becomes very significant. Common initialization methods, such as grid-type partitioning (Ishibuchi et al., 1999) and rule generation on extrema initialization, result in complex and non-interpretable initial models, and the rule-base simplification and reduction steps become computationally demanding. To avoid these problems, fuzzy clustering algorithms (Setnes and Babuška, 1999) were put forward. However, the obtained membership values have to be projected onto the input variables and approximated by parameterized membership functions, which deteriorates the performance of the classifier. This decomposition error can be reduced by using eigenvector projection (Kim et al., 1998), but the obtained linearly transformed input variables do not allow the interpretation of the model. To avoid the projection error and maintain the interpretability of the model, the proposed approach is based on the Gath–Geva (GG) clustering algorithm (Gath and Geva, 1989) instead of the widely used Gustafson–Kessel (GK) algorithm (Gustafson and Kessel, 1979), because the simplified version of GG clustering allows the direct identification of fuzzy models with exponential membership functions (Hoppner et al., 1999).

0167-8655/03/$ - see front matter © 2003 Elsevier B.V. All rights reserved.

doi:10.1016/S0167-8655(03)00047-3

Neither the GG nor the GK algorithm utilizes the class labels. Hence, they give suboptimal results if the obtained clusters are directly used to formulate a classical fuzzy classifier, and there is a need for fine-tuning of the model. This GA or gradient-based fine-tuning, however, can result in overfitting and thus poor generalization of the identified model. Unfortunately, the severe computational requirements of these approaches limit their applicability as a rapid model-development tool.

This paper focuses on the design of interpretable fuzzy rule-based classifiers from data with low human intervention and low computational complexity. Hence, a new modeling scheme is introduced based only on fuzzy clustering. The proposed algorithm uses the class label of each point to identify the optimal set of clusters that describe the data. The obtained clusters are then used to build a fuzzy classifier.

The contribution of this approach is twofold.

The classical fuzzy classifier consists of rules, each one describing one of the C classes. In this paper a new fuzzy model structure is proposed where the consequent part is defined as the probabilities that a given rule represents the c_1, ..., c_C classes. The novelty of this new model is that one rule can represent more than one class with different probabilities.

Classical fuzzy clustering algorithms are used to estimate the distribution of the data. Hence, they do not utilize the class label of each data point available for the identification. Furthermore, the obtained clusters cannot be directly used to build the classifier. In this paper a new cluster prototype and the related clustering algorithm have been introduced that allow the direct supervised identification of fuzzy classifiers. The proposed algorithm is similar to the multi-prototype classifier technique (Biem et al., 2001; Rahman and Fairhurst, 1997), where each class is clustered independently from the other classes and is modeled by a few components (Gaussian in general). The main difference is that in the multi-prototype approach each cluster represents a different class, and the number of clusters used to approximate a given class has to be determined manually, while the proposed approach does not suffer from these problems.

Using too many input variables may result in difficulties in the prediction and interpretability capabilities of the classifier. Hence, the selection of the relevant features is usually necessary. Generally, there is a very large set of possible features to compose the feature vectors of classifiers. As ideally the training set size should increase exponentially with the feature vector size, it is desirable to choose a minimal subset of features. Some generic tips for choosing a good feature set are that the features should discriminate the pattern classes as much as possible and that they should not be correlated/redundant. There are two basic feature-selection approaches: the closed-loop algorithms are based on the classification results, while the open-loop algorithms are based on a distance between clusters. In the former, each possible feature subset is used to train and to test a classifier, and the recognition rates are used as a decision criterion: the higher the recognition rate, the better the feature subset. The main disadvantage of this approach is that choosing a classifier is a critical problem on its own, and that the final selected subset clearly depends on the classifier. The latter, on the other hand, depends on defining a distance between the clusters; some possibilities are the Mahalanobis, Bhattacharyya and class separation distances (Campos and Bloch, 2001).

In this paper the Fisher interclass separability method is utilized, which is an open-loop feature selection approach (Cios et al., 1998). Other papers have focused on feature selection based on similarity analysis of the fuzzy sets (Campos and Bloch, 2001; Roubos and Setnes, 2000). The differences between these reduction methods are: (i) Feature reduction based on the similarity analysis of fuzzy sets results in a closed-loop feature selection because it depends on the actual model, while the applied open-loop feature selection can be used beforehand as it is independent of the model. (ii) In similarity analysis, a feature can be removed from individual rules, whereas in the interclass separability method the feature is omitted in all the rules (Roubos et al., 2001). In this paper the simple Fisher interclass separability method has been modified, but in the future advanced multiclass data reduction algorithms like weighted pairwise Fisher criteria (Loog et al., 2001) could also be used.

The paper is organized as follows. In Section 2, the structure of the new fuzzy classifier is presented. Section 3 describes the developed clustering algorithm that allows for the direct identification of fuzzy classifiers. For the selection of the important features of the fuzzy system, a Fisher interclass separability criterion based method will be presented in Section 4. The proposed approach is studied for the Wisconsin breast cancer and the wine classification examples in Section 5. Finally, the conclusions are given in Section 6.

2. Structure of the fuzzy rule-based classifier

2.1. Classical Bayes classifier

The identification of a classifier system means the construction of a model that predicts the class y_k ∈ {c_1, ..., c_C} to which pattern x_k = [x_{1,k}, ..., x_{n,k}] should be assigned. The classic approach for this problem with C classes is based on Bayes' rule. The probability of making an error when classifying an example x is minimized by Bayes' decision rule of assigning it to the class with the largest a posteriori probability:

$$x \text{ is assigned to } c_i \iff p(c_i \mid x) \ge p(c_j \mid x) \quad \forall j \ne i \qquad (1)$$

The a posteriori probability of each class given a pattern x can be calculated based on the p(x|c_i) class conditional distribution, which models the density of the data belonging to the class c_i, and the P(c_i) class prior, which represents the probability that an arbitrary example out of the data belongs to class c_i:

$$p(c_i \mid x) = \frac{p(x \mid c_i)\,P(c_i)}{p(x)} = \frac{p(x \mid c_i)\,P(c_i)}{\sum_{j=1}^{C} p(x \mid c_j)\,P(c_j)} \qquad (2)$$

As (1) can be rewritten using the numerator of (2),

$$x \text{ is assigned to } c_i \iff p(x \mid c_i)\,P(c_i) \ge p(x \mid c_j)\,P(c_j) \quad \forall j \ne i \qquad (3)$$

we would have an optimal classifier if we could perfectly estimate the class priors and the class conditional densities.

In practice one needs to find approximate estimates of these quantities on a finite set of training data {x_k, y_k}, k = 1, ..., N. Priors P(c_i) are often estimated on the basis of the training set as the proportion of samples of class c_i, or using prior knowledge. The p(x|c_i) class conditional densities can be modeled with non-parametric methods like histograms or nearest-neighbors, or with parametric methods such as mixture models.

A special case of Bayes classifiers is the quadratic classifier, where the p(x|c_i) distribution generated by the class c_i is represented by a Gaussian function

$$p(x \mid c_i) = \frac{1}{\sqrt{|2\pi \mathbf{F}_i|}} \exp\!\left(-\tfrac{1}{2}(x - v_i)^T \mathbf{F}_i^{-1}(x - v_i)\right) \qquad (4)$$

where v_i = [v_{1,i}, ..., v_{n,i}]^T denotes the center of the i-th multivariate Gaussian and F_i stands for the covariance matrix of the data of the class c_i. In this case, the classification rule (3) can be reformulated based on a distance measure. The sample x_k is classified to the class that minimizes the distance D²_{i,k}(x_k), where the distance measure is inversely proportional to the probability of the data:

$$D_{i,k}^{2}(x_k) = \left( P(c_i)\,\frac{1}{\sqrt{|2\pi \mathbf{F}_i|}} \exp\!\left(-\tfrac{1}{2}(x_k - v_i)^T \mathbf{F}_i^{-1}(x_k - v_i)\right) \right)^{-1} \qquad (5)$$

2.2. Classical fuzzy classifier

The classical fuzzy rule-based classifier consists of fuzzy rules, each one describing one of the C classes. The rule antecedent defines the operating region of the rule in the n-dimensional feature space, and the rule consequent is a crisp (non-fuzzy) class label from the {c_1, ..., c_C} label set:

$$r_i: \ \text{If } x_1 \text{ is } A_{i,1}(x_{1,k}) \text{ and } \ldots \ x_n \text{ is } A_{i,n}(x_{n,k}) \text{ then } \hat{y} = c_i, \ [w_i] \qquad (6)$$

where A_{i,1}, ..., A_{i,n} are the antecedent fuzzy sets and w_i is a certainty factor that represents the desired impact of the rule. The value of w_i is usually chosen by the designer of the fuzzy system according to his or her belief in the accuracy of the rule. When such knowledge is not available, w_i is fixed to the value 1 for any i.

The and connective is modeled by the product operator, allowing for interaction between the propositions in the antecedent. Hence, the degree of activation of the i-th rule is calculated as:

$$\beta_i(x_k) = w_i \prod_{j=1}^{n} A_{i,j}(x_{j,k}) \qquad (7)$$

The output of the classical fuzzy classifier is determined by the winner-takes-all strategy, i.e. the output is the class related to the consequent of the rule that gets the highest degree of activation:

$$\hat{y}_k = c_{i^*}, \quad i^* = \arg\max_{1 \le i \le C} \beta_i(x_k) \qquad (8)$$

To represent the A_{i,j}(x_{j,k}) fuzzy sets, we use Gaussian membership functions

$$A_{i,j}(x_{j,k}) = \exp\!\left(-\frac{1}{2}\frac{(x_{j,k} - v_{j,i})^2}{\sigma_{j,i}^2}\right) \qquad (9)$$

where v_{j,i} represents the center and σ²_{j,i} the variance of the Gaussian function.

The use of the Gaussian membership function allows for the compact formulation of (7):

$$\beta_i(x_k) = w_i A_i(x_k) = w_i \exp\!\left(-\tfrac{1}{2}(x_k - v_i)^T \mathbf{F}_i^{-1}(x_k - v_i)\right) \qquad (10)$$

where v_i = [v_{1,i}, ..., v_{n,i}]^T denotes the center of the i-th multivariate Gaussian and F_i stands for a diagonal matrix that contains the σ²_{i,j} variances. The fuzzy classifier defined by the previous equations is in fact a quadratic Bayes classifier when F_i in (4) contains only diagonal elements (variances). (For more details, refer to the paper of Baraldi and Blonda (1999), which overviews this issue.)

In this case, the A_i(x) membership functions and the w_i certainty factors can be calculated from the parameters of the Bayes classifier following Eqs. (4) and (10) as

$$A_i(x) = p(x \mid c_i)\,\sqrt{|2\pi \mathbf{F}_i|}, \qquad w_i = \frac{P(c_i)}{\sqrt{|2\pi \mathbf{F}_i|}} \qquad (11)$$
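As an illustration of Eqs. (7)–(10), the sketch below evaluates the product-conjunction rule activations and applies the winner-takes-all rule (8). The two rules and their parameters are invented toy values, not a fitted model.

```python
# Classical fuzzy classifier, Eqs. (7)-(10): Gaussian membership functions,
# product conjunction, winner-takes-all defuzzification.
# Rule parameters below are illustrative assumptions.
import numpy as np

def rule_activation(x, v, sigma2, w=1.0):
    """beta_i(x) = w_i * prod_j exp(-0.5 (x_j - v_{j,i})^2 / sigma^2_{j,i}), Eq. (7)/(10)."""
    return w * np.prod(np.exp(-0.5 * (x - v) ** 2 / sigma2))

def classify(x, rules):
    """Winner-takes-all, Eq. (8): return the label of the most active rule."""
    betas = [rule_activation(x, v, s2, w) for (v, s2, w, _) in rules]
    return rules[int(np.argmax(betas))][3]

# One rule per class: (center v_i, variances sigma^2_i, weight w_i, class label)
rules = [
    (np.array([0.0, 0.0]), np.array([1.0, 1.0]), 1.0, "benign"),
    (np.array([4.0, 4.0]), np.array([1.0, 1.0]), 1.0, "malignant"),
]
print(classify(np.array([0.5, 0.3]), rules))   # "benign"
print(classify(np.array([3.8, 4.2]), rules))   # "malignant"
```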

2.3. Bayes classifier based on mixture of density models

One of the possible extensions of the classical quadratic Bayes classifier is to use a mixture of models for estimating the class-conditional densities. The usage of mixture models in Bayes classifiers is not so widespread (Kambhatala, 1996). In these solutions each conditional density is modeled by a separate mixture of models. A possible criticism of such Bayes classifiers is that in a sense they are modeling too much: for each class, many aspects of the data are modeled which may or may not play a role in discriminating between the classes.

In this paper a new approach is presented. The p(c_i|x) posteriori densities are modeled by a mixture of R > C models (clusters):

$$p(c_i \mid x) = \sum_{l=1}^{R} p(r_l \mid x)\,P(c_i \mid r_l) \qquad (12)$$

where p(r_l|x) represents the a posteriori probability that x has been generated by the r_l-th local model, and P(c_i|r_l) denotes the prior probability that this model represents the class c_i.

Similarly to (2), p(r_l|x) can be written as

$$p(r_l \mid x) = \frac{p(x \mid r_l)\,P(r_l)}{p(x)} = \frac{p(x \mid r_l)\,P(r_l)}{\sum_{j=1}^{R} p(x \mid r_j)\,P(r_j)} \qquad (13)$$

By using this mixture of density models, the posteriori class probability can be expressed following Eqs. (2), (12) and (13) as

$$p(c_i \mid x) = \frac{p(x \mid c_i)\,P(c_i)}{p(x)} = \sum_{l=1}^{R} \frac{p(x \mid r_l)\,P(r_l)}{\sum_{j=1}^{R} p(x \mid r_j)\,P(r_j)}\,P(c_i \mid r_l) = \frac{\sum_{l=1}^{R} p(x \mid r_l)\,P(r_l)\,P(c_i \mid r_l)}{p(x)} \qquad (14)$$

The Bayes decision rule can thus be formulated similarly to (3) as

$$x \text{ is assigned to } c_i \iff \sum_{l=1}^{R} p(x \mid r_l)\,P(r_l)\,P(c_i \mid r_l) \ge \sum_{l=1}^{R} p(x \mid r_l)\,P(r_l)\,P(c_j \mid r_l) \quad \forall j \ne i \qquad (15)$$

where the p(x|r_l) distributions are represented by Gaussians similarly to (4).

2.4. Extended fuzzy classifier

A new fuzzy model that is able to represent the Bayes classifier defined by (15) can be obtained. The idea is to define the consequent of the fuzzy rule as the probabilities that the given rule represents the c_1, ..., c_C classes:

$$r_i: \ \text{If } x_1 \text{ is } A_{i,1}(x_{1,k}) \text{ and } \ldots \ x_n \text{ is } A_{i,n}(x_{n,k}) \text{ then } \hat{y}_k = c_1 \text{ with } P(c_1 \mid r_i), \ldots, \hat{y}_k = c_C \text{ with } P(c_C \mid r_i), \ [w_i] \qquad (16)$$

Similarly to Takagi–Sugeno fuzzy models (Takagi and Sugeno, 1985), the rules of the fuzzy model are aggregated using the normalized fuzzy mean formula, and the output of the classifier is determined by the label of the class that has the highest activation:

$$\hat{y}_k = c_{i^*}, \quad i^* = \arg\max_{1 \le i \le C} \frac{\sum_{l=1}^{R} \beta_l(x_k)\,P(c_i \mid r_l)}{\sum_{l=1}^{R} \beta_l(x_k)} \qquad (17)$$

where β_l(x_k) has the meaning expressed by (7).

As the previous equation can be rewritten using only its numerator, the obtained expression is identical to the Gaussian mixture Bayes classifier (15) when, similarly to (11), the parameters of the fuzzy model are calculated as

$$A_i(x) = p(x \mid r_i)\,\sqrt{|2\pi \mathbf{F}_i|}, \qquad w_i = \frac{P(r_i)}{\sqrt{|2\pi \mathbf{F}_i|}} \qquad (18)$$

The main advantage of the previously presented classifier is that the fuzzy model can consist of more rules than classes and every rule can describe more than one class. Hence, as a given class will be described by a set of rules, it need not be a compact geometrical object (hyper-ellipsoid).

The aim of the remaining part of the paper is to propose a new clustering-based technique for the identification of the fuzzy classifier presented above. In addition, a new method for the selection of the antecedent variables (features) of the model will be described.

3. Supervised fuzzy clustering

The objective of clustering is to partition the identification data Z into R clusters. Each observation consists of input and output variables, grouped into a row vector z_k = [x_k^T, y_k], where the subscript k = 1, ..., N denotes the k-th row of the Z pattern matrix. The fuzzy partition is represented by the R × N matrix U = [μ_{i,k}], where the element μ_{i,k} represents the degree of membership of the observation z_k in the cluster i = 1, ..., R.

The clustering is based on the minimization of the sum of the weighted squared distances D²_{i,k} between the data points and the cluster prototypes η_i that contain the parameters of the clusters:

$$J(\mathbf{Z}; \mathbf{U}, \eta) = \sum_{i=1}^{R} \sum_{k=1}^{N} (\mu_{i,k})^m D_{i,k}^2(z_k, r_i) \qquad (19)$$

where m is a fuzzy weighting exponent that determines the fuzziness of the resulting clusters. As m approaches one from above, the partition becomes hard (μ_{i,k} ∈ {0, 1}) and the v_i are the ordinary means of the clusters. As m → ∞, the partition becomes maximally fuzzy (μ_{i,k} = 1/R). Usually, m is chosen as m = 2.

Classical fuzzy clustering algorithms are used to estimate the distribution of the data. Hence, they do not utilize the class label of each data point available for the identification. Furthermore, the obtained clusters cannot be directly used to build the classifier. In the following, a new cluster prototype and the related distance measure will be introduced that allow the direct supervised identification of fuzzy classifiers. As the clusters are used to obtain the parameters of the fuzzy classifier, the distance measure is defined similarly to the distance measure of the Bayes classifier (5):

$$\frac{1}{D_{i,k}^{2}(z_k, r_i)} = \underbrace{P(r_i) \prod_{j=1}^{n} \exp\!\left(-\frac{1}{2}\frac{(x_{j,k} - v_{i,j})^2}{\sigma_{i,j}^2}\right)}_{\text{Gath–Geva clustering}} \times P(c_j = y_k \mid r_i) \qquad (20)$$

This distance measure consists of two terms. The first term is based on the geometrical distance between the v_i cluster centers and the x_k observation vector, while the second is based on the probability that the r_i-th cluster describes the density of the class of the k-th data point, P(c_j = y_k | r_i). It is interesting to note that this distance measure only slightly differs from the unsupervised GG clustering algorithm, which can also be interpreted in a probabilistic framework (Gath and Geva, 1989). However, the novelty of the proposed approach is the second term, which allows the use of class labels.

To get a fuzzy partitioning space, the membership values have to satisfy the following conditions:

$$\mathbf{U} \in \mathbb{R}^{R \times N} \ \Big|\ \mu_{i,k} \in [0,1] \ \forall i,k; \quad \sum_{i=1}^{R} \mu_{i,k} = 1 \ \forall k; \quad 0 < \sum_{k=1}^{N} \mu_{i,k} < N \ \forall i \qquad (21)$$

The minimization of the functional (19) represents a non-linear optimization problem that is subject to the constraints defined by (21) and can be solved by using a variety of available methods. The most popular method is alternating optimization (AO), which consists of the application of Picard iteration through the first-order conditions for the stationary points of (19). These can be found by adjoining the constraints (21) to J by means of Lagrange multipliers (Hoppner et al., 1999),

$$\bar{J}(\mathbf{Z}; \mathbf{U}, \eta, \lambda) = \sum_{i=1}^{R} \sum_{k=1}^{N} (\mu_{i,k})^m D_{i,k}^2(z_k, r_i) + \sum_{k=1}^{N} \lambda_k \left(\sum_{i=1}^{R} \mu_{i,k} - 1\right) \qquad (22)$$

and by setting the gradients of J̄ with respect to U, η and λ to zero. Hence, similarly to the update equations of the GG clustering algorithm, the following equations will result in a solution that satisfies the constraints (21).

Initialization. Given a set of data Z, specify R and choose a termination tolerance ε > 0. Initialize the R × N partition matrix U = [μ_{i,k}] randomly, where μ_{i,k} denotes the membership that the data point z_k is generated by the i-th cluster.

Repeat for l = 1, 2, ...

Step 1. Calculate the parameters of the clusters.

Calculate the centers and standard deviations of the Gaussian membership functions (the diagonal elements of the F_i covariance matrices):

$$v_i^{(l)} = \frac{\sum_{k=1}^{N}\left(\mu_{i,k}^{(l-1)}\right)^m x_k}{\sum_{k=1}^{N}\left(\mu_{i,k}^{(l-1)}\right)^m}, \qquad \sigma_{i,j}^{2,(l)} = \frac{\sum_{k=1}^{N}\left(\mu_{i,k}^{(l-1)}\right)^m (x_{j,k} - v_{j,i})^2}{\sum_{k=1}^{N}\left(\mu_{i,k}^{(l-1)}\right)^m} \qquad (23)$$


Estimate the consequent probability parameters:

$$P(c_i \mid r_j) = \frac{\sum_{k \mid y_k = c_i}\left(\mu_{j,k}^{(l-1)}\right)^m}{\sum_{k=1}^{N}\left(\mu_{j,k}^{(l-1)}\right)^m}, \quad 1 \le i \le C, \ 1 \le j \le R \qquad (24)$$

Estimate the a priori probability of the clusters and the weight (impact) of the rules:

$$P(r_i) = \frac{1}{N}\sum_{k=1}^{N}\left(\mu_{i,k}^{(l-1)}\right)^m, \qquad w_i = P(r_i) \prod_{j=1}^{n} \frac{1}{\sqrt{2\pi\sigma_{i,j}^2}} \qquad (25)$$

Step 2. Compute the distance measure D²_{i,k}(z_k, r_i) by (20).

Step 3. Update the partition matrix:

$$\mu_{i,k}^{(l)} = \frac{1}{\sum_{j=1}^{R}\left(D_{i,k}(z_k, r_i)/D_{j,k}(z_k, r_j)\right)^{2/(m-1)}}, \quad 1 \le i \le R, \ 1 \le k \le N \qquad (26)$$

until ||U^{(l)} − U^{(l−1)}|| < ε.
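The iteration above can be sketched end-to-end as follows. This is an illustrative re-implementation of Eqs. (20) and (23)–(26), not the authors' original code; it assumes m = 2, diagonal covariances and a small synthetic labeled data set.

```python
# Supervised fuzzy clustering: alternate between cluster-parameter updates
# (Eqs. (23)-(25)), the supervised distance (Eq. (20)), and the partition
# update (Eq. (26)) until the partition matrix stops changing.
import numpy as np

def supervised_fuzzy_clustering(X, y, R, m=2.0, tol=1e-4, max_iter=100, seed=0):
    N, n = X.shape
    classes = np.unique(y)
    rng = np.random.default_rng(seed)
    U = rng.random((R, N))
    U /= U.sum(axis=0)                        # random fuzzy partition
    for _ in range(max_iter):
        Um = U ** m
        # Step 1: centers and variances, Eq. (23)
        v = (Um @ X) / Um.sum(axis=1, keepdims=True)
        sig2 = np.stack([(Um[i] @ (X - v[i]) ** 2) / Um[i].sum()
                         for i in range(R)]) + 1e-12
        Pr = Um.sum(axis=1) / N               # cluster priors P(r_i), Eq. (25)
        Pc_r = np.stack([                     # consequents P(c|r_i), Eq. (24)
            np.array([Um[i][y == c].sum() for c in classes]) / Um[i].sum()
            for i in range(R)])
        # Step 2: supervised distance measure, Eq. (20)
        D2 = np.empty((R, N))
        for i in range(R):
            g = np.exp(-0.5 * ((X - v[i]) ** 2 / sig2[i]).sum(axis=1))
            pc = Pc_r[i][np.searchsorted(classes, y)]
            D2[i] = 1.0 / (Pr[i] * g * pc + 1e-30)
        # Step 3: partition update, Eq. (26)
        p = 1.0 / (m - 1.0)
        U_new = 1.0 / (D2 ** p * (1.0 / D2 ** p).sum(axis=0))
        done = np.abs(U_new - U).max() < tol
        U = U_new
        if done:
            break
    return U, v, sig2, Pr, Pc_r

# Two well-separated labeled blobs, two clusters
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.3, (20, 2)), rng.normal(3, 0.3, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
U, v, sig2, Pr, Pc_r = supervised_fuzzy_clustering(X, y, R=2)
```

The resulting v, sig2, Pr and Pc_r map directly onto the rule parameters of the classifier via Eqs. (18) and (25).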

The remainder of this section is concerned with the theoretical convergence properties of the proposed algorithm. Since this algorithm is a member of the family of algorithms discussed in (Hathaway and Bezdek, 1993), the following discussion is based on the results of Hathaway and Bezdek (1993). Using Lagrange multiplier theory, it is easily shown that for D²_{i,k}(z_k, r_i) ≥ 0, (26) defines U^{(l+1)} to be a global minimizer of the restricted cost function (22). From this it follows that the proposed iterative algorithm is a special case of grouped coordinate minimization, and the general convergence theory from (Bezdek et al., 1987) can be applied for reasonable choices of D²_{i,k}(z_k, r_i) to show that any limit point of an iteration sequence will be a minimizer, or at worst a saddle point, of the cost function J. The local convergence result in (Bezdek et al., 1987) states that if the distance measures D²_{i,k}(z_k, r_i) are sufficiently smooth and a standard convexity condition holds at a minimizer of J, then any iteration sequence started with U^{(0)} sufficiently close to U* will converge to the minimum. Furthermore, the rate of convergence of the sequence will be c-linear: there exist a norm ||·|| and constants 0 < c < 1 and l_0 > 0 such that for all l ≥ l_0, the sequence of errors {e_l} = {||U^{(l)} − U*||} satisfies the inequality e_{l+1} < c e_l.

4. Feature selection based on interclass separability

Using too many input variables may result in difficulties in the interpretability of the obtained classifier. Hence, selection of the relevant features is usually necessary. Others have focused on reducing the antecedent variables by similarity analysis of the fuzzy sets (Roubos and Setnes, 2000); however, this method is not very suitable for feature selection. In this section the Fisher interclass separability method (Cios et al., 1998), which is based on statistical properties of the data, is modified. The interclass separability criterion is based on the between-class covariance matrix F_B and the within-class covariance matrix F_W, which sum up to the total covariance of the data, F_T = F_W + F_B, where

$$\mathbf{F}_W = \sum_{l=1}^{R} P(r_l)\,\mathbf{F}_l, \qquad \mathbf{F}_B = \sum_{l=1}^{R} P(r_l)\,(v_l - v_0)^T (v_l - v_0), \qquad v_0 = \sum_{l=1}^{R} P(r_l)\,v_l \qquad (27)$$

The feature interclass separability selection criterion is a trade-off between F_W and F_B:

$$J = \frac{\det \mathbf{F}_B}{\det \mathbf{F}_W} \qquad (28)$$

The importance of a feature is measured by leaving out the feature of interest and calculating J for the reduced covariance matrices. The feature selection is a step-wise procedure, where in every step the least needed feature is deleted from the model.

In the current implementation of the algorithm, after fuzzy clustering and initial model formulation, a given number of features are selected by continuously checking the decrease of the performance of the classifier. To increase the classification performance, the final classifier is identified based on the re-clustering of the reduced data, which have smaller dimensionality because of the neglected input variables.

5. Performance evaluation

In order to examine the performance of the proposed identification method, two well-known multidimensional classification benchmark problems are presented in this section. The studied Wisconsin breast cancer and wine data come from the UCI Repository of Machine Learning Databases (http://www.ics.uci.edu).

The performance of the obtained classifiers was measured by 10-fold cross validation. The data are divided into ten sub-sets of cases that have similar size and class distributions. Each sub-set is left out once, while the other nine are applied for the construction of the classifier, which is subsequently validated on the unseen cases in the left-out sub-set.

5.1. Example 1: the Wisconsin breast cancer classification problem

The Wisconsin breast cancer data is widely used to test the effectiveness of classification and rule extraction algorithms. The aim of the classification is to distinguish between benign and malignant cancers based on the available nine measurements: x_1 clump thickness, x_2 uniformity of cell size, x_3 uniformity of cell shape, x_4 marginal adhesion, x_5 single epithelial cell size, x_6 bare nuclei, x_7 bland chromatin, x_8 normal nucleoli, and x_9 mitosis (data shown in Fig. 1). The attributes have integer values in the range [1, 10] (Baraldi and Blonda, 1999; Hoppner et al., 1999). The original database contains 699 instances; however, 16 of these are omitted because they are incomplete, which is common practice in other studies. The class distribution is 65.5% benign and 34.5% malignant, respectively.

The advanced version of C4.5 gives a misclassification rate of 5.26% in 10-fold cross validation (94.74% correct classification) with tree size 25 ± 0.5 (Quinlan, 1996). Nauck and Kruse (1999) combined neuro-fuzzy techniques with interactive strategies for rule pruning to obtain a fuzzy classifier. An initial rule-base was made by applying two sets for each input, resulting in 2^9 = 512 rules, which was reduced to 135 by deleting the non-firing rules. A heuristic data-driven learning method was applied instead of gradient descent learning, which is not applicable for triangular membership functions. Semantic properties were taken into account by constraining the search space. The final fuzzy classifier could be reduced to two rules with only 5–6 features, with a misclassification rate of 4.94% in 10-fold validation (95.06% classification accuracy). Rule-generating methods that combine GA and fuzzy logic were also applied to this problem (Peña-Reyes and Sipper, 2000). In this method the number of rules to be generated needs to be determined a priori. This method constructs a fuzzy model that has four membership functions and one rule with an additional else part. Setiono (2000) has generated a similarly compact classifier by a two-step rule extraction from a feedforward neural network trained on preprocessed data.


As Table 1 shows, our fuzzy rule-based classifier is one of the most compact models in the literature with such high accuracy.

In the current implementation of the algorithm, after fuzzy clustering an initial fuzzy model is generated that utilizes all the nine information profile data about the patient. A step-wise feature reduction algorithm has been used where in every step one feature is removed, while continuously checking the decrease of the performance of the classifier on the training data. To increase the classification performance, the classifier is re-identified in every step by re-clustering of the reduced data, which have smaller dimensionality because of the neglected input variables. As Table 2 shows, our supervised clustering approach gives better results than utilizing the GG clustering algorithm in the same identification scheme.

The 10-fold validation experiment with the proposed approach showed 95.57% average classification accuracy, with 90.00% as the worst and 98.57% as the best performance. This is really good for such a small classifier compared with previously reported results (Table 1). As the error estimates are either obtained from 10-fold cross validation or from testing the solution once by using 50% of the data as a training set, the results given in Table 1 are only roughly comparable.

5.2. Example 2: the wine classification problem

The wine data contains the chemical analysis of 178 wines grown in the same region in Italy but derived from three different cultivars. The problem is to distinguish the three different types based on 13 continuous attributes derived from chemical analysis (Fig. 2). Corcoran and Sen (1994) applied all the 178 samples for learning 60 non-fuzzy if–then rules in a real-coded genetic-based machine learning approach. They used a population of 1500 individuals and applied 300 generations, with full replacement, to come up with the following result for 10 independent trials: best classification rate 100%, average classification rate 99.5% and worst classification rate 98.3%, which is three misclassifications. Ishibuchi et al. (1999) applied all the 178 samples designing a fuzzy classifier with 60 fuzzy rules by means of an integer-coded genetic algorithm and grid partitioning. Their population contained 100 individuals and they applied 1000 generations, with full replacement, to come up with the following result for 10 independent trials: best classification rate 99.4% (one misclassification),

Table 1
Classification rates and model complexity for classifiers constructed for the Wisconsin breast cancer problem

Author                        | Method       | #Rules | #Conditions | Accuracy (%)
Setiono (2000)                | NeuroRule 1f | 4      | 4           | 97.36
Setiono (2000)                | NeuroRule 2a | 3      | 11          | 98.1
Peña-Reyes and Sipper (2000)  | Fuzzy-GA1    | 1      | 4           | 97.07
Peña-Reyes and Sipper (2000)  | Fuzzy-GA2    | 3      | 16          | 97.36
Nauck and Kruse (1999)        | NEFCLASS     | 2      | 10–12       | 95.06*

* Denotes results from averaging a 10-fold validation.

Table 2
Classification rates and model complexity for classifiers constructed for the Wisconsin breast cancer problem

Method     | Min Acc. | Mean Acc. | Max Acc. | Min #Feat. | Mean #Feat. | Max #Feat.
GG, R = 2  | 84.28    | 90.99     | 95.71    | 8          | 8.9         | 9
Sup, R = 2 | 84.28    | 92.56     | 98.57    | 7          | 8           | 9
GG, R = 4  | 88.57    | 95.14     | 98.57    | 9          | 9           | 9
Sup, R = 4 | 90.00    | 95.57     | 98.57    | 8          | 8.7         | 9

Results from a 10-fold validation. GG: Gath–Geva clustering based classifier, Sup: proposed method.

Fig. 2. Wine data: three classes and 13 attributes (Alcohol, Malic acid, Ash, Alcalinity of ash, Magnesium, Total phenols, Flavonoids, Nonflavonoid phenols, Proanthocyanins, Color intensity, Hue, OD280/OD315, Proline).

average classification rate 98.5% and worst classification rate 97.8% (four misclassifications). In both approaches the final rule base contains 60 rules. The main difference is the number of model evaluations that was necessary to come to the final result.

Firstly, for comparison purposes, a fuzzy classifier that utilizes all the 13 information profile data about the wine has been identified by the proposed clustering algorithm based on all the 178 samples. Fuzzy models with three and six rules were identified. The three-rule model gave only two misclassifications (correct percentage 98.9%). When a cluster was added to improve the performance of this model, the obtained classifier gave only one misclassification (99.4%). The classification power of the identified models is then compared with fuzzy models with the same number of rules obtained by GG clustering, as GG clustering can be considered the unsupervised version of the proposed clustering algorithm. The GG identified fuzzy model achieves eight misclassifications, corresponding to a correct percentage of 95.5%, when three rules are used in the fuzzy model, and six misclassifications (correct percentage 96.6%) in the case of four rules. The results are summarized in Table 3. As is shown, the performance of the obtained classifiers is comparable to those in

Table 3
Classification rates on the wine data for 10 independent runs

Method                   | Best result (%) | Average result (%) | Worst result (%) | Rules | Model evaluations
Corcoran and Sen (1994)  | 100             | 99.5               | 98.3             | 60    | 150000
Ishibuchi et al. (1999)  | 99.4            | 98.5               | 97.8             | 60    | 6000
GG clustering            | 95.5            | 95.5               | 95.5             | 3     | 1
Sup (13 features)        | 98.9            | 98.9               | 98.9             | 3     | 1
Sup (13 features)        | 99.4            | 99.4               | 99.4             | 4     | 1


Table 4
Classification rates and model complexity for classifiers constructed for the wine classification problem

Method     | Min Acc. | Mean Acc. | Max Acc. | Min #Feat. | Mean #Feat. | Max #Feat.
GG, R = 3  | 83.33    | 94.38     | 100      | 10         | 12.4        | 13
Sup, R = 3 | 88.88    | 97.77     | 100      | 12         | 12.6        | 13
GG, R = 3  | 88.23    | 95.49     | 100      | 4          | 4.8         | 5
Sup, R = 3 | 76.47    | 94.87     | 100      | 4          | 4.8         | 5
GG, R = 6  | 82.35    | 94.34     | 100      | 4          | 4.9         | 5
Sup, R = 6 | 88.23    | 97.15     | 100      | 4          | 4.8         | 5

Results from averaging a 10-fold validation.

(CorcoranandSen,1994;Ishibuchietal.,1999),butusefarlessrules(3–5comparedto60)andlessfeatures.

Theseresultsindicatethattheproposedclus-teringmethode ectivelyutilizestheclasslabels.AscanbeseenfromTable3,becauseofthesim-plicityoftheproposedclusteringalgorithm,theproposedapproachisattractiveincomparisonwithotheriterativeandoptimizationschemesthatinvolvesextensiveintermediateoptimizationtogeneratefuzzyclassi ers.

The 10-fold validation is a rigorous test of the classifier identification algorithms. These experiments showed 97.77% average classification accuracy, with 88.88% as the worst and 100% as the best performance (Table 4). The above presented automatic model reduction technique removed only one feature without a decrease of the classification performance on the training data. Hence, to avoid possible local minima, the feature selection algorithm is used to select only five features, and the proposed scheme has been applied again to identify a model based on the selected five attributes. This compact model with an average of 4.8 rules showed 97.15% average classification accuracy, with 88.23% as the worst and 100% as the best performance. The resulting membership functions and the selected features are shown in Fig. 3.
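The 10-fold protocol behind Table 4 can be sketched as follows. This is not the authors' code: the fold construction and the min/mean/max accuracy aggregation follow the description above, while the nearest-centroid classifier is only an illustrative stand-in for the identified fuzzy classifier.

```python
import random
from collections import defaultdict

def stratified_folds(labels, k=10, seed=0):
    """Assign a fold index (0..k-1) to every sample, class-balanced."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    fold_of = [0] * len(labels)
    for idxs in by_class.values():
        rng.shuffle(idxs)
        for j, i in enumerate(idxs):
            fold_of[i] = j % k
    return fold_of

def cross_validate(X, y, fit, predict, k=10):
    """Generic k-fold loop; returns (min, mean, max) accuracy in %."""
    fold_of = stratified_folds(y, k)
    accs = []
    for f in range(k):
        train = [i for i in range(len(y)) if fold_of[i] != f]
        test = [i for i in range(len(y)) if fold_of[i] == f]
        model = fit([X[i] for i in train], [y[i] for i in train])
        hits = sum(predict(model, X[i]) == y[i] for i in test)
        accs.append(100.0 * hits / len(test))
    return min(accs), sum(accs) / len(accs), max(accs)

# Stand-in classifier: one centroid per class, nearest-centroid decision.
def fit_centroids(X, y):
    groups = defaultdict(list)
    for x, c in zip(X, y):
        groups[c].append(x)
    return {c: [sum(col) / len(xs) for col in zip(*xs)]
            for c, xs in groups.items()}

def predict_centroid(model, x):
    return min(model, key=lambda c: sum((a - b) ** 2
                                        for a, b in zip(x, model[c])))
```

Reporting the minimum, mean, and maximum over the folds, as in Table 4, exposes the spread of the classifier's performance rather than a single average figure.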



Comparing the fuzzy sets in Fig. 3 with the data in Fig. 2 shows that the obtained rules are highly interpretable. For example, the Flavonoids are divided into low, medium and high, which is clearly visible in the data.
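The low/medium/high reading of a feature such as Flavonoids can be sketched with three Gaussian membership functions; the centers and widths below are illustrative placeholders, not the parameters identified for the wine data.

```python
import math

def gauss(x, center, sigma):
    """Gaussian membership degree of x in one linguistic term."""
    return math.exp(-((x - center) ** 2) / (2 * sigma ** 2))

# Illustrative partition of a feature's range into three linguistic terms.
TERMS = {"low": (1.0, 0.6), "medium": (2.2, 0.6), "high": (3.2, 0.6)}

def label(x):
    """Linguistic label with the highest membership degree."""
    return max(TERMS, key=lambda t: gauss(x, *TERMS[t]))
```

Because each term's membership function peaks in a distinct region of the feature's range, the rules that reference these terms can be read off directly against the data, which is what makes the classifier interpretable.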

6. Conclusions

In this paper a new fuzzy classifier has been presented to represent Bayes classifiers defined by a mixture of Gaussians density model. The novelty of this model is that each rule can represent more than one class with different probabilities. For the identification of the fuzzy classifier a supervised clustering method has been worked out that is a modification of the unsupervised GG clustering algorithm. In addition, a method for the selection of the relevant input variables has been presented. The proposed identification approach is demonstrated by the Wisconsin breast cancer and the wine benchmark classification problems. The comparison to GG clustering and GA-tuned fuzzy classifiers indicates that the proposed supervised clustering method effectively utilizes the class labels and is able to identify compact and accurate fuzzy systems.
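The rule structure summarized above — each rule is a Gaussian cluster carrying a probability for every class, and the decision mixes the rules' class probabilities by their firing degrees — can be sketched as follows. This is an illustrative reading of the classifier, not the identified wine model: the rule parameters below are made up, and diagonal covariances are assumed for brevity.

```python
import math

# One "rule": a Gaussian membership function over the inputs plus a
# class-probability vector P(class | rule). Parameters are illustrative.
RULES = [
    {"mean": [1.0, 2.0], "var": [0.5, 0.5], "prior": 0.5,
     "p_class": {1: 0.9, 2: 0.1, 3: 0.0}},
    {"mean": [4.0, 1.0], "var": [0.8, 0.4], "prior": 0.5,
     "p_class": {1: 0.1, 2: 0.6, 3: 0.3}},
]

def firing(rule, x):
    """Gaussian membership degree of x in the rule's cluster."""
    d = sum((xi - m) ** 2 / (2 * v)
            for xi, m, v in zip(x, rule["mean"], rule["var"]))
    norm = math.prod(math.sqrt(2 * math.pi * v) for v in rule["var"])
    return math.exp(-d) / norm

def classify(x):
    """Bayes decision: weight each rule's class probabilities by its
    prior-weighted firing degree and pick the most probable class."""
    w = [r["prior"] * firing(r, x) for r in RULES]
    total = sum(w) or 1.0
    post = {}
    for wi, r in zip(w, RULES):
        for c, p in r["p_class"].items():
            post[c] = post.get(c, 0.0) + wi * p / total
    return max(post, key=post.get)
```

Unlike the classical one-rule-per-class structure, a single rule here can vote for several classes at once, which is what lets a small rule base cover overlapping class-conditional densities.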

Acknowledgements

This work was supported by the Hungarian Ministry of Education (FKFP-0073/2001) and the Hungarian Science Foundation (OTKA TO37600). Part of the work was elaborated while J. Abonyi was at the Control Laboratory of Delft University of Technology. J. Abonyi is grateful for the Janos Bolyai Fellowship of the Hungarian Academy of Sciences.

References

Baraldi, A., Blonda, P., 1999. A survey of fuzzy clustering algorithms for pattern recognition - Part I. IEEE Trans. Systems Man Cybernet. Part B 29 (6), 778-785.
Bezdek, J.C., Hathaway, R.J., Howard, R.E., Wilson, C.A., Windham, M.P., 1987. Local convergence analysis of a grouped variable version of coordinate descent. J. Optimization Theory Appl. 71, 471-477.
Biem, A., Katagiri, S., McDermott, E., Juang, B.H., 2001. An application of discriminative feature extraction to filter-bank-based speech recognition. IEEE Trans. Speech Audio Process. 9 (2), 96-110.
Campos, T.E., Bloch, I., Cesar Jr., R.M., 2001. Feature selection based on fuzzy distances between clusters: First results on simulated data. In: ICAPR'2001 - International Conference on Advances in Pattern Recognition, Rio de Janeiro, Brazil, May. In: Lecture Notes in Computer Science. Springer-Verlag, Berlin.
Cios, K.J., Pedrycz, W., Swiniarski, R.W., 1998. Data Mining Methods for Knowledge Discovery. Kluwer Academic Press, Boston.
Corcoran, A.L., Sen, S., 1994. Using real-valued genetic algorithms to evolve rule sets for classification. In: IEEE-CEC, June 27-29, Orlando, USA, pp. 120-124.
Gath, I., Geva, A.B., 1989. Unsupervised optimal fuzzy clustering. IEEE Trans. Pattern Anal. Machine Intell. 11 (7), 773-781.
Gustafson, D.E., Kessel, W.C., 1979. Fuzzy clustering with a fuzzy covariance matrix. In: Proceedings of IEEE CDC, San Diego, USA.
Hathaway, R.J., Bezdek, J.C., 1993. Switching regression models and fuzzy clustering. IEEE Trans. Fuzzy Systems 1, 195-204.
Hoppner, F., Klawonn, F., Kruse, R., Runkler, T., 1999. Fuzzy Cluster Analysis - Methods for Classification, Data Analysis and Image Recognition. John Wiley and Sons, New York.
Ishibuchi, H., Nakashima, T., Murata, T., 1999. Performance evaluation of fuzzy classifier systems for multidimensional pattern classification problems. IEEE Trans. SMC-B 29, 601-618.
Kambhatala, N., 1996. Local models and Gaussian mixture models for statistical data processing. Ph.D. Thesis, Oregon Graduate Institute of Science and Technology.
Kim, E., Park, M., Kim, S., Park, M., 1998. A transformed input-domain approach to fuzzy modeling. IEEE Trans. Fuzzy Systems 6, 596-604.
Loog, L.C.M., Duin, R.P.W., Haeb-Umbach, R., 2001. Multiclass linear dimension reduction by weighted pairwise Fisher criteria. IEEE Trans. PAMI 23 (7), 762-766.
Nauck, D., Kruse, R., 1999. Obtaining interpretable fuzzy classification rules from medical data. Artificial Intell. Med. 16, 149-169.
Peña-Reyes, C.A., Sipper, M., 2000. A fuzzy genetic approach to breast cancer diagnosis. Artificial Intell. Med. 17, 131-155.
Quinlan, J.R., 1996. Improved use of continuous attributes in C4.5. J. Artificial Intell. Res. 4, 77-90.
Rahman, A.F.R., Fairhurst, M.C., 1997. Multi-prototype classification: improved modelling of the variability of handwritten data using statistical clustering algorithms. Electron. Lett. 33 (14), 1208-1209.




Roubos, J.A., Setnes, M., 2000. Compact fuzzy models through complexity reduction and evolutionary optimization. In: FUZZ-IEEE, May 7-10, San Antonio, USA, pp. 762-767.
Roubos, J.A., Setnes, M., Abonyi, J., 2001. Learning fuzzy classification rules from data. In: John, R., Birkenhead, R. (Eds.), Developments in Soft Computing. Springer-Verlag, Berlin/Heidelberg, pp. 108-115.
Setiono, R., 2000. Generating concise and accurate classification rules for breast cancer diagnosis. Artificial Intell. Med. 18, 205-219.
Setnes, M., Babuška, R., 1999. Fuzzy relational classifier trained by fuzzy clustering. IEEE Trans. SMC-B 29, 619-625.
Setnes, M., Babuška, R., Kaymak, U., van Nauta Lemke, H.R., 1998. Similarity measures in fuzzy rule base simplification. IEEE Trans. SMC-B 28, 376-386.
Takagi, T., Sugeno, M., 1985. Fuzzy identification of systems and its application to modeling and control. IEEE Trans. SMC 15, 116-132.
Valente de Oliveira, J., 1999. Semantic constraints for membership function optimization. IEEE Trans. FS 19, 128-138.
