Deliverable D5.4 Mathematical Knowledge Management and searchability Project Acronym MKM-NE
(IST)
PROGRAMME
ProjectIST-2001-37057MKM-NET
DeliverableD5.4
MathematicalKnowledgeManagementandsearchability
I.Dahn(UniversityofKoblenz-Landau)A.Asperti(UniversityofBologna)
ProjectAcronym:MKM-NET
Proposal/Contractno.:IST-2001-37057MKM-NET
MKM-NET,IST-2001-37057Contents
1Introduction2FulltextSearch3Semi-textualsearch4SemanticSearchforTerms5Fastfilteringofthesearchspace6SemanticSearchforKnowledge7KeyPhraseSearch
8ExploitingSemanticRelations9
Summary
2
333455678
MKM-NET,IST-2001-370573
1Introduction
Inthisdocumentwedescribethestateoftheartforsearchingmathematicalcontent.WedonotdiscussgeneralstandardsfordocumentmarkupsuchasDublinCoreorZ39.50.
Mathematicalcontentisusuallysearchedbypeoplewhointendtoretrievespecificcontentforspecificpurposes.Ultimatelythesepurposesdeterminewhetheracertainretrieveditemisappropriateornot.Theultimategoalofsearchproceduresisnormallytoretrieveexactlythecontentthatisrequired.
Inthisrespectsearchingformathematicalcontentisnotdifferentfromsearchingforanyothercontent.But,themajorproblemishowtoanalyzetheavailablecontentandhowtodescribeitinaformatthatcanbeutilizedbysearchengines.
Intheremainderofthisdocumentwediscussandevaluateanumberofapproachestosearchingmathe-maticalcontentrangingfrombasictextualmethodsthroughtosemantics-driventechniques.
2FulltextSearch
Fulltextsearchisthesimplestmethod.Formathematicalcontentthismethodisespeciallylimitedsinceimportantpartsofthecontentconsistofformulas.Tobeofpracticaluse,itmustbeeasyfortheusertoenterthecontenttobesearchedfor.Butwhereformulasareinvolved,thisisnotstraightforward,althoughthereare
ATXshells–whichcanhelp,theyareyettobeintegratedintoafewtools–computeralgebrasystemsandLE
theinterfacesforsearchingmathematicalcontent.
ATX–thelanguagemostlyEveniftheuserwerecapableofenteringthedescriptionofaformulainLE
commonlyusedbyauthorsofmathematicaldocuments–thecorrespondingcontentmaynotbefoundsince
ATXmacro.theauthormayhaveexpressedtheformuladifferently,forexampleusinghisownuser-definedLEATX.AlsoforspecialcharacterslikeGermanumlautsthereareseveralpossibilitiesforannotationinLE
Inordertosupportfullyeventhisuncomfortableapproachtofulltextsearch,theencodingofformulasandspecialcharactersmustbestandardized,macrosmustbeexpandedandsearchrequestshavetobetranslatedaccordingly–forexampletranslatingumlautsintotheirstandardencoding.Theimplicationisthatitisnecessarytopreprocessdocumentsbeforetheycanbesearched.
Insomedocumentrepresentations,fulltextsearchforformulasbecomesimpossible,whenforexample
ATXtoHTML,suchformulasarerepresentedasimages.ThistechniqueisusedbypopularconvertersfromLE
asTeX4HtandLeTeX2HTML.
Althoughithasmanyinadequacies,fulltextsearchisnotentirelyuseless.Theresaonisthat,inmathe-maticaldocumentsknowledgeisfrequentlyexpressedasamixtureoftextandformulassuchas:
“IfAandBaregroups,thenA⊗B⊗Aisagrouptoo.”
Inthisexample,fulltextsearchcouldrevealthatthesentencetalksaboutgroupsbutitcouldnotreadilydetectthemoreimportantfactabouttheconstruction(denotedby⊗).
3Semi-textualsearch
Fulltextsearchbehavesbetterwhenthestructureoftheinformationisrelativelyrigid.Forinstance,greporsimilartoolshaveprovedtobeveryeffectiveintherealmofcomputerscience,wherethestructureoftextualdocuments(typicallyaprogram,ortheoutputofaprogram)ismuchmoreregularthanthatoftheeverydaymathematicallanguage.
AninterestingexperimentinbuildingasemanticbasedretrievalsystemintegratedwithgrephasbeenconductedaspartoftheMizarproject[2].Mizarisalong-termprojectaimedatbuildingacomprehensivelibraryofformalmathematicalknowledge(MizarMathematicalLibrary–MML)expressedbymeansofasuitable,semi-naturallanguage.Thegrammarofthislanguageissufficientlyflexibletoovercomemostoftheusualandannoyinglimitationsofformalmathematics;ontheotheritisstillsufficientlyrigidtoallowsignificanttextualqueries.Theproposedlanguageisbasedonawidesetofprimitiveoperations(returning
MKM-NET,IST-2001-370574
foreacharticlethesetofconceptsintroducedinit,orforeachmathematicalitemthesetofallotheritemswhicharereferencedinit,orwhichuseit,justtogiveafewexamples).Typicaloperationsoverlists,suchasunion,merge,filterandsoonallowtheconstructionofcomplexcompoundqueries.Themainfilteringoperationisstilltextual,directlycallinggrep.
TheworkonsemanticbasedretrievalinMMLhasonlyrecentlystarted.Theimportanceofsophisticatedtoolsforbrowsingandsearchinghasbecomeapparentinsomesub-projectswithinMizarwhichinvolvemanyauthorsandmanyarticles.Thecurrent,stillexperimental,implementationturnedouttobeveryhelpful,providingsearchingfunctionalitiesimpossiblewithapurelytextualsearch.Aproblemofthelanguage,atleastinitspresentversion,isthatitisclearlygearedtowardadvancedMizarusers.
4SemanticSearchforTerms
Unliketextfromotherfields,mathematicaltexthasastandardsemanticsagreedbymostpractitioners.Itisthereforepossibletoimprovethesearchpossibilitiesbytakingadvantageofthisstandardsemantics.OMDocspecifiesastandardwayforsupportingthisandMBase[9]isaspecificdatabasesupportingthesemanticretrievalofterms.
ThebasicideaofOMDocistospecifythewayinwhichmathematicaltextisaugmentedbyadescriptionofitsmeaning.Potentiallyitbecomesevenpossibletoomitthetextandtogenerateitondemandfromthesemanticdescription.Writingthesesemanticdescriptionsisnotaneasytaskandhassofarbeenaccom-plishedonlyforasmallsetofsampledocuments.Therequiredexpertiseisnotavailableformostworkingmathematiciansandsupportingtoolsarestillatanearlystageoftheirdevelopment.
However,onceestablished,theavailabilityofthesemanticannotationopensupanewclassofpossibilitiesforsearchingmathematicalcontent.Forexample,itbecomespossibletosearchforasymbol,say⊗,andtodistinguishbetweenitsuseasthedenotationforacartesianproductofgroupsfromitsuseforvectorproduct.
Notethatfortheusertheproblemofeasyqueryentryremainsandisfurthercomplicatedbytheneedtospecifythecontextinwhichthequerymaybesatisfied.Aparticularsituationwherethiscanbemadesimplerfortheuseriswhenthesearchcanberelatedtosomecurrentcontent,thatistodeterminethecontextualconstraintfromthequeryorigincontext.
Inthiscasetheusermaycopythesubjectofinterestintoasearchtemplateandthecopymechanismcanensurethatallrelevantsemanticinformationistransferredaswell.Whensearchisrestrictedtoselectedcontentobjectsonly,theseselectedobjectsmaybepresentedinadocumentaslinks.Aclickonsuchalinkcantheninitiateasearchforspecificrelatedcontent.TheActiveMathsystem,developedatDFKIinSaarbr¨ucken,usesthistechniquetotolinkeachoccurrenceofasymbolinadocumenttoitsdefinition.
Besidessearchingforwell-definedformulas,addingsemanticannotationoffersthepossibilityofretrievinginstancesoftemplates.Forexampleasearchfor“X⊗Y⊗X”,whereXandYarevariables(wildcards)inthetemplate,willretrieveA⊗B⊗AaswellasA⊗(B⊗C)⊗AandB⊗A⊗BbutnotA⊗B⊗C.
Notehoweverthattheexactbehaviourofthissemantictemplatesearchdependsonthestandardizationofthesearchsemanticsandthattheuserhastobeawareofit.ForexampleifthestandardsemanticsofX⊗Y⊗Zis(X⊗Y)⊗Z,asearchfor“X⊗Y⊗X”willfind(A⊗B)⊗AbutnotA⊗(B⊗A)thoughbothexpressionsaresemanticallyequivalentinthecasethat⊗isassociative.Totakeaccountofthis,itwouldbenecessaryeithertoaddalsoallsemanticallyequivalentannotations(whichisinfeasiblebecausetheremightbeinfinitelymanyequivalentforms)orthesearchenginehastotestsimilarlocatedcontenton-the-flyforsemanticequivalence.Theintegrationofacomputeralgebrasystemintoasearchenginemayprovideawaytorealizethis,butresourcedemandswillbeconsiderable.
Itmayinfactevendependonthepurposeofthesearchwhetheritisdesirabletoretrievesemanticallyequivalentcontent.Forabeginnerwhostillhastolearnthattheoperation⊗iscommutativeitmaynotbedesirabletoretrieveB⊗A2whensearchingfor“X⊗Y⊗X”whileanexperienceduserwouldwelcomeitifthesearchengineallowsforseveralvariantsofsearchrequests.Currentlythereisnotenoughpracticalexperiencetodecidehowauserinterfaceshouldbedesignedtomakethelargevarietyofpossiblesearchvariantsenabledbysemanticsearchtotheuser.
MKM-NET,IST-2001-370575
5Fastfilteringofthesearchspace
Itisnotuncommontorequireunificationinordertoprocessthesortofsemanticqueriesinwhichweareinterested.Assuch,wearemakingdemandswhichconventionaldatabasequerylanguagescannotmeet.Aworkaroundisthereforetoapplyaunificationprocessexternallyonasetofcandidatematchesanditbecomesessentialtobeabletofilterthedatacomingoutoftherepositoryveryrapidlyinordertoproduceareasonablysmallsetofcandidates.
Tosolvethisproblem,theHELMgroupinBolognahasproposedanapproach[6]basedonfourdistinctphases:
•Data-mining:asmallsetofmetadataisautomaticallycomputedfromeachtheoremordefinitioninthelibrary,providinganapproximationofitsactualcontent.
•Patterncompilation:whentheusersubmitsasemanticquery,thisistranslatedintoasetofconstraintsoverthemetadataandthencompiledintoalow-levelquery.
•Filtering:thelow-levelqueryisexecutedoverthedatabaseofmetadata,obtainingasetofcandidates.•Matching:theactualsemanticqueryisiteratedoverallcandidates,obtainingapreciseresultset.
Ifthefilteringoperationisbothquickandcorrect(inthesensethatnogoodcandidatesaredropped),thepreviousapproachachievesbothaccuracyandperformance.Moreover,itisextremelymodular:extendingthemetadatasetweimprovetheaccuracyoftheapproximation,andaretypicallyabletocopewithmorecomplexsemanticqueries.
TheMetadatausedbytheHELMgroupareessentiallyrelationsexpressingthatagivenmathematicalitemreferstoanotheritematagivenposition.Moreover,anextremelysimplesetofcrucialpositiondescriptorshasbeenidentified,coveringaverylargesetofqueriesofcommoninterest.
Atypicalunificationqueryistranslatedintoatwosetsofconstraints,providingalowerandanupperboundforthecandidates.Inparticular,allcandidatesmustbemoreinstantiatedthanwhatisrequiredtosatisfythesocalledmustconstraints,butlessinstantiatedthanrequiredtomeettheonlycontraints.Theprecisegenerationofmustandonlyconstraintsdependsontheparticularformofunificationinwhichtheuserisinterested.
Themanagementofmustandonlyconstraintstypicallyrequiressetcomparisonoperationsthatarenotavailableintraditionalrelationalquerylanguages.Forthisreason,anew(RDF-based)querylanguagehasalsobeendesignedandimplemented[5,7].
AnextensivestatisticalanalysishasbeenperformedoverthelibraryoftheCoqProofAssistant[1],inordertovalidatethemetadatamodel,tomeasureitsscalabilityandtotestitsdiscriminatorypower.
Theexperimentshavebeenquitesuccessful,andtheapproachlooksextremelypromising,butitremainstobeseenwhethertheapproachcanbeeffectivelyappliedinthecontextofautomatictheoremproving.
6SemanticSearchforKnowledge
Itisnotonlyformulas(terms)butalsomanynaturallanguagesentencesinmathematicaldocumentsthathaveaprecisemeaning.Letusconsideragainthesentence“ifAandBaregroups,thenA⊗B⊗Aisagrouptoo”.Thiscanformalizedas
group(A)∧group(B)⇒group(A⊗B⊗A).
ThussearchingforthetemplateX⇒group(Y⊗Z⊗U)whereX,Y,Z,UarevariablesinthetemplatewouldyieldthesamplesentenceinthedatabasethatAandBbeinggroupsisasufficientconditionforA⊗B⊗Abeingagroup.
But,thissearchwouldfailifthesemanticsofthesentencewereannotatedintheequivalentform“IfAisagroupandA⊗B⊗Aisnotagroup,thenBisnotagroup”
group(A)∧¬group(A⊗B⊗A)⇒¬group(B).
Atraditional(butexpensive)approach[4]tothisproblemistoidentifystatementsuptoisomorphisms(thatis,essentially,statementshavingasame,suitablenormalform).Thetheoryoftypeisomorphismisvery
MKM-NET,IST-2001-370576
interestingandquiteentangled,butunfortunatelyalsoquiteexpensivetoimplement.Moreover,ingeneral,itisnotexpressiveenoughtocaptureotherinterestingcases,suchasthesimpleexpansionofadefinition:justconsiderforinstancethetwologicallyequivalentsentences(a)xispositiveand(b)x>0.
In1997theILFsystemdevelopedattheHumboldtUniversityBerlinwithintheDFGfocusprogramme“Deduction”wasusedtohandlethisprobleminthefollowingway.
Firstofalltheobjectiveofthesearchwasreinterpreted.Itwasnolongerthegoaltofindoutwhetheracertainsentencecouldberetrievedfromtheknowledgebaseexactlyasentered.Ratherthegoalbecametofindoutwhethertheformulaenteredwasasemanticconsequenceoftheknowledgethatwasinthedatabase.
Todecidethis,firstthesentenceentered1bytheuserwasanalysedtodeterminethesymbolsitcontained.Secondly,anordinarydatabaserequestwasmadetoretrieveallcontentusingonlythesesymbolsandpossiblyarelatedsetofauxiliarysymbolssuchastheequalitysymbol.Thirdly,twoautomatictheoremproverswerelaunchedincompetitionwithoneanothergiventhegoalofprovingthephraseenteredbytheuserfromtheretrievedformulasinalimitedtime(30secondsinthisexperiment).
Ifoneoftheproverswassuccessful,thegeneratedproofwasanalyzedandallretrievedsentencesthathadactuallybeenusedintheproofoftheconjectureoftheuserwerereturned.Inthiscasetheuserreceivedtheinformationthattheirconjecturecanbeeasilyprovedfromtheseretrievedformulas.Ofcourse,thespecialcasethattheformulaenteredwasexactlylikeoralogicalvariantofoneinthedatabaseishandledbythisprocedureaswell.
Ifnoprovergeneratedaproofinthelimitedavailabletime,theuserwasinformedthatthesearchwasnotsuccessfulbutthatseveralrelatedformulaswerefoundinthedatabasesothattheusercouldexaminethemandattempttoverifytheconjecture.
Thepurposeofusingtwotheoremproversintheseexperimentswasjusttoevaluatethepowerofdifferentproversforthesesearchtasks.Theexperimentsusedthestrongestfirstordertheoremproversavailableatthattimeonadatabaseof100formalizedtheoremsfromtheMizarlibrary.Thoughthesearchwashighlyflexibleandfairlytoleranttovariationsinuserinputitwasapparentthatitwouldnotscaleupwithcurrentautomatedprovers.Themainreasonforthisisthatcurrentprovers,beingtailoredforfirstorderlogic,donotmakeefficientuseofthetypeinformationcontainedintheformalizedknowledge.Anotherproblemisthatthesearchspacegrowsexponentiallywiththenumberofcandidateformulasthatcanbeusedintheproof.Heremorerefinedpreprocessingandsemanticfilteringtechniqueswillhavetobedeveloped.Alsomodifyingthetasksothatonlyasingleformulaequivalenttotheuser’sconjecturehastobefoundcouldprovidehigherefficiencybyparallelizingtheresultingprooftasks.
Itshouldbealsonotedthatbyusingfirstorderproversitisnotpossibletosearchagainstatemplatewhereoneofthevariablesstandsforacompletesub-formulaandisthusahigherordervariable.Therearehigherorderproverswhocanhandlesuchrequests,butthesearchspacetheyhavetohandlewillbestillmorecomplex.
7KeyPhraseSearch
Allsearchproceduresdiscussedsofarcheckonlyfortheoccurrenceofsomemathematicalcontent.Theyarenotabletoevaluatetherelevanceoftheoccurrencesfound.Forexampleausersearchingforthecartesianproductsymbol⊗mayfindhundredsofplaceswherethissymbolisusedasanauxiliaryoperator,thoughhewaspossiblyonlyinterestedinseeingitsdefinitionanditsmostbasicproperties.
Currentlythereisnoconvincingwaytofindoutautomaticallywhatacertainpieceofcontentisactuallydescribing.Thisisinpartduetotheproblemsofanalyzingformulaswhichhavebeendiscussedabove.
Howeverithappensfrequentlythattheconceptsthatareexplainedbyapieceofcontentsimplydonotoccurintheactualtext.Thisphenomenoniswellknownalsofromnon-mathematicaltextswherekeyphrasesfromanindex,characterizingwhatatextisabout,donotnecessarilyoccurinthetextitself.
MKM-NET,IST-2001-370577
Automatedindexingtechniquestrytosolvethisproblembyanalyzingthetextandcomparingthewordsfoundwiththewordsinotherdocumentsthathavebeenmanuallyindexedbeforehand.WeakenrichedthesauriasdevelopedatCWIinAmsterdamareanexampleofthesemethods.Herekeyphrasesareassignedbasedontheoccurrenceofspecificgroupsofwords.Othermethodsusestatisticaldata.
Allthesemethodssharetheproblemthatimportantinformationisnotcontainedinthetextbutintheformulaswhichcannotbeanalyzedthisway.Theyfallshortofdeliveringresultsthatareusefulfortheenduser.[8]givesasurveyofthecurrentstateoftheapplicationofweakenrichedthesauriformathematicaldocumentsandderivesdirectionsforfurtherresearchfromthecurrentstate.Evenifautomatedindexingtechniquesarecurrentlynotusablestandaloneformathematicaldocuments,theymayprovidehintsformanualindexingbypointingtheindexertopotentiallyrelevantkeyphrasesfromanexistingthesaurus.
Mathematicshastheadvantageofpossessingastandardsubjectclassification,MSC2000,thatisgenerallyagreedbythemathematicalcommunity.Thisclassification,however,describesonlyratherlargesubtopicsofMathematics.Thereforeitcanbeusedtoclassifydocumentsbutitisinappropriatetoretrievecontentfromwithinthesedocuments.
BuildingandmaintainingastandardthesaurusforMathematicsthatislinkedwiththeMathematicsSub-jectClassificationisanurgentneed.WithintheISTprojectTrial-Solution,FIZKarlsruhehasstartedworktowardsthisbasedonkeyphrasesassignedbyauthorsof19mathematicalbooksandbyadditionalkeyphrasesassignedfromwithintheproject–thislatterworkismainlydoneattheMathematicalInstituteoftheTech-nicalUniversityinChemnitz.Theprojectthesaurushasca.6,000keyphrasesandFIZprovidessupportforaddingnewkeyphrasestothethesaurusinaconsistentway.
Keyphrasesearchcanbemorerefinedwhenitiscombinedwithsearchforspecifictypesofinformation.Forexampleinordertoprepareanexam,thetutorwillsearchforexercisesonatopicwherethetopicitselfischaracterizedbyacollectionofkeyphrases.
Keyphrasesarealsoasimplemethodtofindcontentfromotherdocumentsthatisrelatedtogivenpiecesofcontent.Whenamultilingualthesaurusisavailableitisalsopossiblethatausersearchesforkeyphrasesinhisnativelanguagebutretrievesalsocontentinotherlanguageswhichshemayunderstandevenwithoutknowingtheappropriatesearchphrasesinthatlanguage.
Keyphrasesaremetadatathatdescribecontent.Whenmathematicalknowledgeistoberetrieveditis,however,notonlythecontentthatmatters.Whenitistobeusedforteachingitisalsousefultosearchforcontentthathasbeenpreparedforaspecificaudienceorcontentforwhichlearningtakesacertainamountoftime.Alsotechnicalrestrictions(likeMimetypes)mightbeessential.Itmaybedesirabletosearchforapplicationsorformotivationsforacertainconcept.
StandardlearningobjectsmetadataschemaslikethosefromIEEEandIMSprovidethemeanstoencodethisinformationinastandardizedway.Thehigheffortinassigningthiskindofmetadatacontrastssharplywithitsdecreasedvaluewhenthecontentisreusedinadifferentcontext.Forexamplewhatwasintendedtobeamotivatingexamplewhichcanberunthroughbyaphysicsstudentwithlittleeffortmightbecomeanapplicationforamathematicsstudentthatrequiresconsiderableeffortstofillthegapsleftinthetext.Thesamemathematicsstudentwillhoweverhavenoproblemsunderstandingthetextwhenhereturnstoitlaterforasecondview.
8ExploitingSemanticRelations
Themethodsdiscussedsofarretrievemathematicalcontentbasedonfeaturesassignedtoindividualcontentpieces.Thisisnotenoughwhensmallpiecesofcontentaretoberetrievedforreuse.Piecesofcontentdonotliveinisolation.Theymakesenseonlyinanenvironmentwhichprovidesthenecessarycontextorprerequisites:Theoremshaveproofswhichmakeuseofothertheorems;exercisesrequireunderstandingofprevioustheoremsetc.
Modellingthisnetworkofinterrelationsgivesthepossibilityofcomposingmeaningfuldocumentsforspe-cificpurposes.SlicingBookTechnology,asinventedattheUniversityKoblenz-LandauandfurtherdevelopedintheISTprojectTrial-Solution,hasappliedthistoasmuchas5,000pagesofpre-existingmathematical
MKM-NET,IST-2001-370578
texts.
Ithasturnedoutthatthemostrelevantsemanticrelationsbetweenpiecesofcontentarebinary,thatistheyrelateonlytwopiecesofcontent.[3]describestherelationswhichhavebeenfoundtoberelevantinthatproject.Themostimportantofthemis“AreferencesB”withtheintendedmeaning“InordertounderstandA,Bmustbeunderstood”.
Theserelationsaswellaskeyphrasesandtypesofcontentthathavebeendiscussedbeforecanbecom-binedwithinformationabouttheuserfromausermodel.Thiscanbedonebyanautomatedinferenceengine.Thisinferenceenginemustalsoprocessa(small)setofgeneralruleswhichdescribethesetofcontentslicesthatshouldberetrievedforauserwithacertainobjectiveandacertainsortofuserprofile.Basedonthisinformationtheinferenceenginewillderiverecommendationsforcontenttobepresented.Theserecommen-dationscanbefurthermodifiedbytheuserortheycanbeusedtolaunchthedynamiccreationofanewdocumentfromtheselectedslicesonthefly.
Inthiswaysearchisnotinitiatedbyenteringaconcepttobesearchedfor(thoughthisaswellasfulltextsearchispossible)butbyarequesttoadaptaselectionofinterestingslicesautomaticallyforaspecificusecase.Alsothefinalresultofthesearchisnotacollectionofoccurrencesbutanewdocumentwiththeretrievedcontentthatdidnotexistbefore.
Therequiredbinaryrelationsbetweencontentslicescanonlyrarelybeassignedautomatically.Explicitreferencescanbefoundinthetextandusedforthispurpose,butusuallythereareonlyfewsuchreferences.Consequently,theymustbeassignedmanually.
ExperimentsintheTrial-Solutionprojectshowthatthiscanbedonebyexperiencedstudentsatanaveragespeedof15slicesperhour,includingtheassignmentofkeyphrasesandtypes.
Itisimportanttonotethatthiseffortisnotdoneforaparticularusergrouporusagescenario:whenreuseforanewpurposeorinanewcontextisrequired,itisonlynecessarytoadaptthegeneralrulesthatdescribetheintendedsearchresults.Therelationsandkeyphrasesassignedtothecontentslicesdonothavetoberevisedoradapted.
DemosandproductsusingthistechnologyhavebeenproducedbySlicingInformationTechnologyandareavailablethroughthecompany’swebsiteathttp://www.slicing-infotech.de.
9Summary
Themostwidelyusedmethodstosearchformathematicalcontentarecurrentlyfulltextsearchandsearchforkeyphrases.Usersareusedtothisandappreciatetheadditionalhelpthatonlineeditorsofferforthesetypesofsearch.SlicingBookTechnologyasexplainedintheprevioussectionisafurtherstepforward.Itcanbegraduallyintroducedasanextensionoftheuseofkeyphrases.Itdoesnotrequireanychangeofthecontentitself.Semanticsearchprovidesthetoolsforaprecisesearchformathematicalobjects,toexchangethesewithothersystemsandtoevaluatethem,butthistypeofsearchtypicallyrequiresrewritingthecontentcompletely.Complexsemanticsearchhastodealwiththeproblemofdealingwithsemanticallyequivalentobjects.Theirhandlingrequiresfurtherresearchandtoolswhicharemostlikelytoarisefromcomputeralgebrasystemsandautomatedtheoremprovers.Acombinationwithkeyphrasesearchmayadditionallyhelpsolvingtheproblemofensuringtherelevanceofthemathematicalobjectsretrieved.
References
[1]A.Asperti,F.Guidi,L.Padovani,C.Sacerdoti.SomestatisticalconsiderationsontheCoqlibrary.Draft,
UniversityofBologna.2003.[2]GrzegorzBancerek,P.Rudnicki.InformationRetrievalinMML.ProceedingsofthesecondInternational
ConferenceonMathematicalKnowledgeManagement,Bertinoro,Italy,February2003.LNCS2594,pp.119–131.
MKM-NET,IST-2001-370579
[3]I.Dahn.ManagementofInformalMathematicalKnowledge-LessonsLearnedfromtheTrial-Solution
Project.In:FengshaiBaiandBerndWegner(eds.):ElectronicInformationandCommunicationinMath-ematics,LNCSvol.2730,pp.29–43,2003[4]D.Delahaye.informationRetrievalinacoqprooflibraryusingTypeisomorphisms.InT.Coquandetal.
editors,TYPES,LNCS1956,pp.131-147,200.[5]F.Guidi.SearchingandRetrievinginContent-BasedRepositoriesofFormalMathematicalKnowledge,
Ph.D.ThesisinComputerScience,TechnicalReportUBLCS2003-06,UniversityofBologna,March2003.[6]F.Guidi,C.SacerdotiCoen.QueryingDistributedDigitalLibrariesofMathematics.InCalculemus
2003,AracneEditriceS.R.L.,Ther´eseHardinandRenaudRiobooeditors,ISBN88-7999-545-6,pp.43–57.[7]F.Guidi,I.Schena.AQueryLanguageforaMetadataFrameworkaboutMathematicalResources.Pro-ceedingsofthesecondInternationalConferenceonMathematicalKnowledgeManagement,Bertinoro,Italy,February2003.LNCS2594,pp.105–118.[8]M.Hazewinkel.Identificationcloudsandautomaticassignmentofkeyphrases:lessonslearnedfromthe
TRIALSOLUTIONproject.Internalpaper,Trial-Solution2003.[9]M.Kohlhase,A.Franke.MBase:RepresentingKnowledgeandContentfortheIntegrationofMathe-maticalSoftwareSystems.J.SymbolicComputation,32(4),2001,pp365–402.
因篇幅问题不能全部显示,请点此查看更多更全内容