Compressed Sensing
David L. Donoho, Member, IEEE

Abstract—Suppose x is an unknown vector in R^m (a digital image or signal); we plan to measure n general linear functionals of x and then reconstruct. If x is known to be compressible by transform coding with a known transform, and we reconstruct via the nonlinear procedure defined here, the number of measurements n can be dramatically smaller than the size m. Thus, certain natural classes of images with m pixels need only n = O(m^{1/4} log^{5/2}(m)) nonadaptive nonpixel samples for faithful recovery, as opposed to the usual m pixel samples. More specifically, suppose x has a sparse representation in some orthonormal basis (e.g., wavelet, Fourier) or tight frame (e.g., curvelet, Gabor)—so the coefficients belong to an ℓ^p ball for 0 < p ≤ 1. The N most important coefficients in that expansion allow reconstruction with ℓ^2 error O(N^{1/2−1/p}). It is possible to design n = O(N log(m)) nonadaptive measurements allowing reconstruction with accuracy comparable to that attainable with direct knowledge of the N most important coefficients. Moreover, a good approximation to those N important coefficients is extracted from the n measurements by solving a linear program—Basis Pursuit in signal processing. The nonadaptive measurements have the character of "random" linear combinations of basis/frame elements. Our results use the notions of optimal recovery, of n-widths, and information-based complexity. We estimate the Gel'fand n-widths of ℓ^p balls in high-dimensional Euclidean space in the case 0 < p ≤ 1, and give a criterion identifying near-optimal subspaces for Gel'fand n-widths. We show that "most" subspaces are near-optimal, and show that convex optimization (Basis Pursuit) is a near-optimal way to extract information derived from these near-optimal subspaces.

Index Terms—Adaptive sampling, almost-spherical sections of Banach spaces, Basis Pursuit, eigenvalues of random matrices, Gel'fand n-widths, information-based complexity, integrated sensing and processing, minimum ℓ^1-norm decomposition, optimal recovery, Quotient-of-a-Subspace theorem, sparse solution of linear equations.
I. INTRODUCTION

As our modern technology-driven civilization acquires and exploits ever-increasing amounts of data, "everyone" now knows that most of the data we acquire "can be thrown away" with almost no perceptual loss—witness the broad success of lossy compression formats for sounds, images, and specialized technical data. The phenomenon of ubiquitous compressibility raises very natural questions: why go to so much effort to acquire all the data when most of what we get will be thrown away? Can we not just directly measure the part that will not end up being thrown away?

In this paper, we design compressed data acquisition protocols which perform as if it were possible to directly acquire just the important information about the signals/images—in effect, not acquiring that part of the data that would eventually just be "thrown away" by lossy compression. Moreover, the protocols are nonadaptive and parallelizable; they do not require knowledge of the signal/image to be acquired in advance—other than knowledge that the data will be compressible—and do not attempt any "understanding" of the underlying object to guide an active or adaptive sensing strategy. The measurements made in the compressed sensing protocol are holographic—thus, not simple pixel samples—and must be processed nonlinearly. In specific applications, this principle might enable dramatically reduced measurement time, dramatically reduced sampling rates, or reduced use of analog-to-digital converter resources.

Manuscript received September 18, 2004; revised December 15, 2005.
The author is with the Department of Statistics, Stanford University, Stanford, CA 94305 USA (e-mail: donoho@stanford.edu).
Communicated by A. Høst-Madsen, Associate Editor for Detection and Estimation.
Digital Object Identifier 10.1109/TIT.2006.871582
0018-9448/$20.00 © 2006 IEEE

A. Transform Compression Background

Our treatment is abstract and general, but depends on one specific assumption which is known to hold in many settings of signal and image processing: the principle of transform sparsity. We suppose that the object of interest is a vector x, which can be a signal or image with m samples or pixels, and that there is an orthonormal basis (ψ_i : 1 ≤ i ≤ m) for x which can be, for example, an orthonormal wavelet basis, a Fourier basis, or a local Fourier basis, depending on the application. (As explained later, the extension to tight frames such as curvelet or Gabor frames comes for free.) The object has transform coefficients θ_i = ⟨x, ψ_i⟩, and these are assumed sparse in the sense that, for some p, 0 < p < 2, and for some R > 0:

  ‖θ‖_p ≡ (Σ_i |θ_i|^p)^{1/p} ≤ R.   (I.1)

Such constraints are actually obeyed on natural classes of signals and images; this is the primary reason for the success of standard compression tools based on transform coding [1]. To fix ideas, we mention two simple examples of ℓ^p constraint.

• Bounded Variation model for images. Here image brightness is viewed as an underlying function f(x, y) on the unit square 0 ≤ x, y ≤ 1, which obeys (essentially) a total variation bound

  ∫∫ |∇f| dx dy ≤ R.

The digital data of interest consist of m = n^2 pixel samples of f produced by averaging over 1/n × 1/n pixels. We take a wavelet point of view; the data are seen as a superposition of contributions from various scales. Let x^(j) denote the component of the data at scale j, and let (ψ^j_k) denote the orthonormal basis of wavelets at scale j, containing 3 · 4^j elements. The corresponding coefficients θ^(j) obey an ℓ^1 bound ‖θ^(j)‖_1 ≤ c · R, with c depending only on the wavelet used.

• Bump Algebra model for spectra. Here a spectrum (e.g., mass spectrum or magnetic resonance spectrum) is modeled as digital samples of an underlying function f on the real line which is a superposition of so-called spectral lines of varying positions, amplitudes, and linewidths. Formally

  f(t) = Σ_i a_i g((t − t_i)/s_i).

Here the parameters t_i are line locations, a_i are amplitudes/polarities, and the s_i are linewidths, and g represents a lineshape, for example the Gaussian, although other profiles could be considered. We assume the constraint Σ_i |a_i| ≤ R, which in applications represents an energy or total mass constraint. Again we take a wavelet viewpoint, this time specifically using smooth wavelets. The data can be represented as a superposition of contributions from various scales. Let x^(j) denote the component of the spectrum at scale j, and let (ψ^j_k) denote the orthonormal basis of wavelets at scale j, containing 2^j elements. The corresponding coefficients again obey an ℓ^1 bound ‖θ^(j)‖_1 ≤ c · R [2].
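As a small numerical sketch of this transform-sparsity principle (our illustration, not from the paper; a plain Haar transform stands in for the smooth wavelets discussed above, purely to keep the example self-contained): a synthetic spectrum built from a few Gaussian lines has almost all of its energy carried by a small fraction of its wavelet coefficients.

```python
import numpy as np

def haar_transform(x):
    """Orthonormal Haar wavelet transform of a signal of length 2^J."""
    x = x.astype(float).copy()
    details = []
    while len(x) > 1:
        s = (x[0::2] + x[1::2]) / np.sqrt(2.0)  # smooth (resume) part
        d = (x[0::2] - x[1::2]) / np.sqrt(2.0)  # detail part
        details.append(d)
        x = s
    details.append(x)                            # coarsest smooth coefficient
    return np.concatenate(details[::-1])

# Synthetic "spectrum" in the Bump Algebra spirit: a superposition of
# Gaussian lines sum_i a_i g((t - t_i)/s_i) with small sum_i |a_i|.
t = np.linspace(0.0, 1.0, 1024)
lines = [(0.2, 1.0, 0.010), (0.5, -0.6, 0.020), (0.8, 0.8, 0.005)]  # (t_i, a_i, s_i)
spectrum = sum(a * np.exp(-0.5 * ((t - ti) / si) ** 2) for ti, a, si in lines)

theta = haar_transform(spectrum)
energy = np.sort(theta ** 2)[::-1]
frac_top50 = energy[:50].sum() / energy.sum()
print(frac_top50)   # almost all energy sits in a few dozen coefficients
```

The transform is orthonormal, so ‖θ‖_2 = ‖x‖_2 exactly; the point is that the sorted coefficient energies decay very quickly, which is what the ℓ^p constraints above formalize.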
While in these two examples the ℓ^1 constraint appeared, other ℓ^p constraints with p < 1 can appear naturally as well; see below. For some readers, the use of ℓ^p norms with p < 1 may seem initially strange; it is now well understood that the ℓ^p norms with such small p are natural mathematical measures of sparsity [3], [4]. As p decreases below 1, more and more sparsity is being required. Also, from this viewpoint, an ℓ^2 constraint based on p = 2 requires no sparsity at all.

Note that in each of these examples, we also allowed for separating the object of interest into subbands, each one of which obeys an ℓ^p constraint. In practice, in the following we stick with the view that the object of interest is a coefficient vector θ obeying the constraint (I.1), which may mean, from an application viewpoint, that our methods correspond to treating various subbands separately, as in these examples.
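To see numerically how small p encodes sparsity (a sketch of ours; the exponent matches (I.2) below): a vector whose sorted entries decay like k^(−1/p) lies in a weak-ℓ^p ball of radius 1, and its best N-term ℓ^2 approximation error decays like N^(1/2−1/p).

```python
import numpy as np

p = 0.5                                  # a "very sparse" regime, p < 1
m = 10000
k = np.arange(1, m + 1)
theta = k ** (-1.0 / p)                  # sorted coefficients |theta|_(k) = k^(-1/p)

def tail_l2(theta_sorted, N):
    """ell^2 norm of everything except the N largest coefficients."""
    return np.linalg.norm(theta_sorted[N:])

Ns = np.array([10, 40, 160, 640])
errs = np.array([tail_l2(theta, N) for N in Ns])
# Empirical decay exponents between successive N; theory predicts 1/2 - 1/p = -1.5
rates = np.log(errs[1:] / errs[:-1]) / np.log(Ns[1:] / Ns[:-1].astype(float))
print(rates)
```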
The key implication of the ℓ^p constraint is sparsity of the transform coefficients. Indeed, we have trivially that, if θ_N denotes the vector θ with everything except the N largest coefficients set to 0,

  ‖θ − θ_N‖_2 ≤ ζ_{2,p} · ‖θ‖_p · (N + 1)^{1/2−1/p},  N = 0, 1, 2, …,   (I.2)

with a constant ζ_{2,p} depending only on p. Thus, for example, to approximate θ with error ε, we need to keep only the N(ε) biggest terms in θ.

B. Optimal Recovery/Information-Based Complexity Background
IEEE TRANSACTIONS ON INFORMATION THEORY, VOL. 52, NO. 4, APRIL 2006

Our question now becomes: if x is an unknown signal whose transform coefficient vector θ obeys (I.1), can we make a reduced number n ≪ m of measurements which will allow faithful reconstruction of x? Such questions have been discussed (for other types of assumptions about x) under the names of Optimal Recovery [5] and Information-Based Complexity [6]; we now adopt their viewpoint, and partially adopt their notation, without making a special effort to be really orthodox. We use "OR/IBC" as a generic label for work taking place in those fields, admittedly being less than encyclopedic about various scholarly contributions.
We have a class X of possible objects of interest, and are interested in designing an information operator I_n : X → R^n that samples n pieces of information about x, and an algorithm A_n : R^n → R^m that offers an approximate reconstruction of x.

Here the information operator takes the form

  I_n(x) = (⟨ξ_1, x⟩, …, ⟨ξ_n, x⟩)

where the ξ_i are sampling kernels, not necessarily sampling pixels or other simple features of the signal; however, they are nonadaptive, i.e., fixed independently of x. The algorithm A_n is an unspecified, possibly nonlinear reconstruction operator.

We are interested in the ℓ^2 error of reconstruction and in the behavior of optimal information and optimal algorithms. Hence, we consider the minimax ℓ^2 error as a standard of comparison:

  E_n(X) = inf_{A_n, I_n} sup_{x ∈ X} ‖x − A_n(I_n(x))‖_2.

So here, all possible methods of nonadaptive linear sampling are allowed, and all possible methods of reconstruction are allowed.

In our application, the class X of objects of interest is the set of objects x = Ψθ where θ = θ(x) obeys (I.1) for a given p and R. Denote then

  X_{p,m}(R) = {x : ‖θ(x)‖_p ≤ R}.

Our goal is to evaluate E_n(X_{p,m}(R)) and to have practical schemes which come close to attaining it.

C. Four Surprises
Here is the main quantitative phenomenon of interest for this paper.

Theorem 1: Let (n, m_n) be a sequence of problem sizes with n < m_n, n → ∞, and m_n ~ A n^γ, γ > 1, A > 0. Then for 0 < p ≤ 1 there is C_p = C_p(A, γ) > 0 so that

  E_n(X_{p,m}(R)) ≤ C_p · R · (n / log(m_n))^{1/2−1/p}.   (I.3)

We find this surprising in four ways. First, compare (I.3) with (I.2). We see that the forms are similar, under the calibration N = n / log(m_n). In words: the quality of approximation to x which could be gotten by using the N biggest transform coefficients can be gotten by using the n ≈ N log(m) pieces of nonadaptive information provided by I_n. The surprise is that we would not know in advance which transform coefficients are likely to be the important ones in this approximation, yet the optimal information operator I_n is nonadaptive, depending at most on the class X_{p,m}(R) and not on the specific object. In some sense this nonadaptive information is just as powerful as knowing the N best transform coefficients.
This seems even more surprising when we note that for objects x ∈ X_{p,m}(R), the transform representation is the optimal one: no other representation can do as well at characterizing x by a few coefficients [3], [7]. Surely then, one imagines, the sampling kernels ξ_i underlying the optimal information operator must be simply measuring individual transform coefficients? Actually, no: the information operator is measuring very complex "holographic" functionals which in some sense mix together all the coefficients in a big soup. Compare (VI.1) below. (Holography is a process where a three-dimensional (3-D) image generates by interferometry a two-dimensional (2-D) transform. Each value in the 2-D transform domain is influenced by each part of the whole 3-D object. The 3-D object can later be reconstructed by interferometry from all or even a part of the 2-D transform domain. Leaving aside the specifics of 2-D/3-D and the process of interferometry, we perceive an analogy here, in which an object is transformed to a compressed domain, and each compressed domain component is a combination of all parts of the original object.)
Another surprise is that, if we enlarged our class of information operators to allow adaptive ones, e.g., operators in which certain measurements are made in response to earlier measurements, we could scarcely do better. Define the minimax error under adaptive information E_n^{Adapt}(X), allowing adaptive operators

  I_n^{Adapt}(x) = (⟨ξ_1, x⟩, ⟨ξ_2, x⟩, …, ⟨ξ_n, x⟩)

where, for i ≥ 2, each kernel ξ_i is allowed to depend on the information ⟨ξ_j, x⟩, j < i, gathered at previous stages. Formally setting

  E_n^{Adapt}(X) = inf_{A_n, I_n^{Adapt}} sup_{x ∈ X} ‖x − A_n(I_n^{Adapt}(x))‖_2

we have the following.

Theorem 2: For 0 < p ≤ 1, there is C_p > 0 so that, for X = X_{p,m}(R),

  E_n(X) ≤ 2^{1/p} · E_n^{Adapt}(X).

So adaptive information is of minimal help—despite the quite natural expectation that an adaptive method ought to be able to iteratively somehow "localize" and then "close in" on the "big coefficients."
An additional surprise is that, in the already-interesting case p = 1, Theorems 1 and 2 are easily derivable from known results in OR/IBC and approximation theory! However, the derivations are indirect; so although they have what seem to the author as fairly important implications, very little seems known at present about good nonadaptive information operators or about concrete algorithms matched to them.

Our goal in this paper is to give direct arguments which cover the case 0 < p ≤ 1 of highly compressible objects, to give direct information about near-optimal information operators and about concrete, computationally tractable algorithms for using this near-optimal information.

D. Geometry and Widths
From our viewpoint, the phenomenon described in Theorem 1 concerns the geometry of high-dimensional convex and nonconvex "balls." To see the connection, note that the class X_{p,m}(R) is the image, under orthogonal transformation, of an ℓ^p ball. If p = 1, this is convex and symmetric about the origin, as well as being orthosymmetric with respect to the axes provided by the wavelet basis; if p < 1, this is again symmetric about the origin and orthosymmetric, while not being convex, but still star-shaped.

To develop this geometric viewpoint further, we consider two notions of n-width; see [5].
Definition 1.1: The Gel'fand n-width of a set X with respect to the ℓ^2 norm is defined as

  d^n(X; ℓ^2) = inf_{V_n} sup {‖x‖_2 : x ∈ V_n^⊥ ∩ X}

where the infimum is over n-dimensional linear subspaces V_n of R^m, and V_n^⊥ denotes the orthocomplement of V_n with respect to the standard Euclidean inner product.

In words, we look for a subspace such that "trapping" x in that subspace causes ‖x‖_2 to be small. Our interest in Gel'fand n-widths derives from an equivalence between optimal recovery for nonadaptive information and such n-widths, well known in the case p ≥ 1 [5], and in the present setting extending as follows.
Theorem 3: For 0 < p ≤ 1 and X = X_{p,m}(R),

  E_n(X) = d^n(X; ℓ^2),  p = 1,   (I.4)
  d^n(X; ℓ^2) ≤ E_n(X) ≤ 2^{1/p−1} · d^n(X; ℓ^2),  0 < p < 1.   (I.5)

Thus the Gel'fand n-widths either exactly or nearly equal the value of optimal information. Ultimately, the bracketing with constant 2^{1/p−1} will be for us just as good as equality, owing to the unspecified constant factors in (I.3). We will typically only be interested in near-optimal performance, i.e., in obtaining E_n to within constant factors.
It is relatively rare to see the Gel'fand n-widths studied directly [8]; more commonly, one sees results about the Kolmogorov n-widths.

Definition 1.2: Let X ⊂ R^m be a bounded set. The Kolmogorov n-width of X with respect to the ℓ^2 norm is defined as

  d_n(X; ℓ^2) = inf_{V_n} sup_{x ∈ X} inf_{y ∈ V_n} ‖x − y‖_2

where the infimum is over n-dimensional linear subspaces V_n of R^m.

In words, d_n measures the quality of approximation of X possible by n-dimensional subspaces V_n.

In the case p = 1, there is an important duality relationship between Kolmogorov widths and Gel'fand widths which allows us to infer properties of d^n from published results on d_n. To state it, let d_n(X; ℓ^q) be defined in the obvious way, based on approximation in ℓ^q rather than ℓ^2 norm. Also, for given p ≥ 1 and q ≥ 1, let p′ and q′ be the standard dual indices, 1/p + 1/p′ = 1, 1/q + 1/q′ = 1. Also, let b_{p,m} denote the standard unit ball of ℓ^p(R^m). Then [8]

  d^n(b_{p,m}; ℓ^q) = d_n(b_{q′,m}; ℓ^{p′}).   (I.6)

In particular

  d^n(b_{1,m}; ℓ^2) = d_n(b_{2,m}; ℓ^∞).
The asymptotic properties of the left-hand side have been determined by Garnaev and Gluskin [9]. This follows major work by Kashin [10], who developed a slightly weaker version of this result in the course of determining the Kolmogorov n-widths of Sobolev spaces. See the original papers, or Pinkus's book [8] for more details.

Theorem 4 (Kashin; Garnaev and Gluskin (KGG)): For all n and m > n

  d^n(b_{1,m}; ℓ^2) ≍ ((1 + log(m/n)) / n)^{1/2}.

Theorem 1 now follows in the case p = 1 by applying KGG with the duality formula (I.6) and the equivalence formula (I.4). The case 0 < p < 1 of Theorem 1 does not allow use of duality, and the whole range 0 < p ≤ 1 is approached differently in this paper.
E. Mysteries …

Because of the indirect manner by which the KGG result implies Theorem 1, we really do not learn much about the phenomenon of interest in this way. The arguments of Kashin, Garnaev, and Gluskin show that there exist near-optimal n-dimensional subspaces for the Kolmogorov widths; they arise as the null spaces of certain matrices with entries ±1 which are known to exist by counting the number of matrices lacking certain properties, the total number of matrices with ±1 entries, and comparing. The interpretability of this approach is limited.
The implicitness of the information operator is matched by the abstractness of the reconstruction algorithm. Based on OR/IBC theory we know that the so-called central algorithm is optimal. This "algorithm" asks us to consider, for given information y_n = I_n(x), the collection of all objects which could have given rise to the data:

  X̂_{y_n} = {x : I_n(x) = y_n}.

Defining now the center of a set S,

  center(S) = argmin_c sup_{x ∈ S} ‖x − c‖_2,

the central algorithm is

  x̂*_n = center(X̂_{y_n} ∩ X)

and it obeys, when the information I_n is optimal,

  sup_{x ∈ X} ‖x̂*_n − x‖_2 = E_n(X);

see Section III below.

This abstract viewpoint unfortunately does not translate into a practical approach (at least in the case of the X_{p,m}(R), 0 < p ≤ 1). The set X̂_{y_n} ∩ X is a section of the ℓ^p ball, and finding the center of this section does not correspond to a standard tractable computational problem. Moreover, this assumes we know p and R, which would typically not be the case.
F. Results

Our paper develops two main types of results.

• Near-Optimal Information. We directly consider the problem of near-optimal subspaces for Gel'fand n-widths of X_{p,m}(R), and introduce three structural conditions (CS1–CS3) on an n-by-m matrix which imply that its nullspace is near-optimal. We show that the vast majority of n-subspaces of R^m are near-optimal, and random sampling yields near-optimal information operators with overwhelmingly high probability.

• Near-Optimal Algorithm. We study a simple nonlinear reconstruction algorithm: simply minimize the ℓ^1 norm of the coefficients θ subject to satisfying the measurements. This has been studied in the signal processing literature under the name Basis Pursuit; it can be computed by linear programming. We show that this method gives near-optimal results for all 0 < p ≤ 1.

In short, we provide a large supply of near-optimal information operators and a near-optimal reconstruction procedure based on linear programming, which, perhaps unexpectedly, works even for the nonconvex case 0 < p < 1.

For a taste of the type of result we obtain, consider a specific information/algorithm combination.
• CS Information. Let Φ be an n × m matrix generated by randomly sampling the columns, with different columns independent and identically distributed (i.i.d.) random uniform on the sphere S^{n−1}. With overwhelming probability for large n, Φ has properties CS1–CS3 discussed in detail in Section II-A below; assume we have achieved such a favorable draw. Let Ψ be the m × m basis matrix with basis vector ψ_i as the ith column. The CS Information operator is the matrix I_n = ΦΨ^T.

• ℓ^1-minimization. To reconstruct from CS Information, we solve the convex optimization problem

  min ‖Ψ^T x‖_1 subject to y_n = ΦΨ^T x.

In words, we look for the object x̂_1 having coefficients with smallest ℓ^1 norm that is consistent with the information y_n.
To evaluate the quality of an information operator I_n, set

  E_n(I_n, X) = inf_{A_n} sup_{x ∈ X} ‖x − A_n(I_n(x))‖_2.

To evaluate the quality of a combined algorithm/information pair (A_n, I_n), set

  E_n(A_n, I_n, X) = sup_{x ∈ X} ‖x − A_n(I_n(x))‖_2.

Theorem 5: Let (n, m_n) be a sequence of problem sizes obeying n < m_n, n → ∞, m_n ~ A n^γ, A > 0, γ > 1; and let I_n be a corresponding sequence of operators deriving from CS matrices with underlying parameters η_i and ρ (see Section II below). Let 0 < p ≤ 1. There exists C = C(p, A, γ) > 0 so that I_n is near-optimal:

  E_n(I_n, X_{p,m}(R)) ≤ C · E_n(X_{p,m}(R))

for R > 0, n > n_0. Moreover, the algorithm A_{1,n} delivering the solution to the ℓ^1-minimization problem above is near-optimal:

  E_n(A_{1,n}, I_n, X_{p,m}(R)) ≤ C · E_n(X_{p,m}(R))

for R > 0, n > n_0.

Thus, for large n, we have a simple description of near-optimal information and a tractable near-optimal reconstruction algorithm.
G. Potential Applications

To see the potential implications, recall first the Bump Algebra model for spectra. In this context, our result says that, for a spectrometer based on the information operator I_n in Theorem 5, it is really only necessary to take a number n of measurements growing far more slowly than m to get an accurate reconstruction of such spectra, rather than the nominal m measurements. However, they must then be processed nonlinearly.

Recall the Bounded Variation model for images. In that context, a result paralleling Theorem 5 says that for a specialized imaging device based on a near-optimal information operator it is really only necessary to take a similarly reduced number of measurements to get an accurate reconstruction of images with m pixels, rather than the nominal m measurements.
The calculations underlying these results will be given below, along with a result showing that for cartoon-like images (which may model certain kinds of simple natural imagery, like brain scans), the number of measurements for an m-pixel image is only O(m^{1/4} log^{5/2}(m)).

H. Contents
Section II introduces a set of conditions CS1–CS3 for near-optimality of an information operator. Section III considers abstract near-optimal algorithms, and proves Theorems 1–3. Section IV shows that solving the convex optimization problem gives a near-optimal algorithm whenever 0 < p ≤ 1. Section V points out immediate extensions to weak-ℓ^p conditions and to tight frames. Section VI sketches potential implications in image, signal, and array processing. Section VII, building on work in [11], shows that conditions CS1–CS3 are satisfied for "most" information operators.

Finally, in Section VIII, we note the ongoing work by two groups (Gilbert et al. [12] and Candès et al. [13], [14]), which, although not written in the n-widths/OR/IBC tradition, implies (as we explain) closely related results.
II. INFORMATION

Consider information operators constructed as follows. With Ψ the orthogonal matrix whose columns are the basis elements ψ_i, and with certain n-by-m matrices Φ obeying conditions specified below, we construct corresponding information operators I_n = ΦΨ^T. Everything will be completely transparent to the choice of orthogonal matrix Ψ and hence we will assume that Ψ is the identity throughout this section.

In view of the relation between Gel'fand n-widths and minimax errors, we may work with n-widths. Let V_n denote as usual the nullspace {x : Φx = 0}. We define the width of a set X relative to an operator Φ:

  w(Φ, X) ≡ sup ‖x‖_2 subject to x ∈ V_n ∩ X.   (II.1)

In words, this is the radius of the section of X cut out by the nullspace V_n. In general, the Gel'fand n-width is the smallest value of w obtainable by choice of Φ:

  d^n(X; ℓ^2) = inf {w(Φ, X) : Φ is an n × m matrix}.

We will show for all large n and m the existence of n by m matrices Φ where

  w(Φ, b_{p,m}) ≤ C · d^n(b_{p,m}; ℓ^2)

with C dependent at most on p and the ratio log(m)/log(n).
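Matrices of the kind used below are easy to sample (our sketch): normalizing i.i.d. Gaussian columns yields columns that are i.i.d. uniform on the unit sphere S^{n−1}.

```python
import numpy as np

def cs_matrix(n, m, rng):
    """n-by-m matrix whose columns are i.i.d. uniform on the sphere S^(n-1)."""
    g = rng.standard_normal((n, m))
    return g / np.linalg.norm(g, axis=0)   # a normalized Gaussian vector is uniform on the sphere

rng = np.random.default_rng(0)
Phi = cs_matrix(128, 1024, rng)
col_norms = np.linalg.norm(Phi, axis=0)
print(Phi.shape, col_norms.min(), col_norms.max())   # every column has unit norm
```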
A. Conditions CS1–CS3

In the following, with J ⊂ {1, …, m}, let Φ_J denote a submatrix of Φ obtained by selecting just the indicated columns of Φ. We let V_J denote the range of Φ_J in R^n. Finally, we consider a family of quotient norms on R^n; with the ℓ^1 norm taken on vectors indexed by J^c = {1, …, m} \ J, set

  Q_J(v) = min ‖θ‖_1 subject to Φ_{J^c} θ = v.

These describe the minimal ℓ^1-norm representation of v achievable using only specified subsets of columns of Φ.

We define three conditions to impose on an n × m matrix Φ, indexed by strictly positive parameters η_1, η_2, η_3, and ρ.

CS1: The minimal singular value of Φ_J exceeds η_1 > 0 uniformly in |J| < ρn/log(m).

CS2: On each subspace V_J we have the inequality

  ‖v‖_1 ≥ η_2 · √n · ‖v‖_2,  v ∈ V_J,

uniformly in |J| < ρn/log(m).

CS3: On each subspace V_J,

  Q_J(v) ≥ η_3 · ‖v‖_1,  v ∈ V_J,

uniformly in |J| < ρn/log(m).

CS1 demands a certain quantitative degree of linear independence among all small groups of columns. CS2 says that linear combinations of small groups of columns give vectors that look much like random noise, at least as far as the comparison of ℓ^1 and ℓ^2 norms is concerned. It will be implied by a geometric fact: every V_J slices through the ℓ^1 ball in R^n in such a way that the resulting convex section is actually close to spherical. CS3 says that for every vector in some V_J, the associated quotient norm is never dramatically smaller than the simple ℓ^1 norm on R^n.

It turns out that matrices satisfying these conditions are ubiquitous for large n and m when we choose the η_i and ρ properly. Of course, for any finite n and m, all norms are equivalent and almost any arbitrary matrix can trivially satisfy these conditions simply by taking η_1, η_2, η_3 very small and ρ very large. However, the definition of "very small" and "very large" would have to depend on m for this trivial argument to work. We claim something deeper is true: it is possible to choose η_i and ρ independent of n and of m. Consider the set of all n × m matrices having unit-normalized columns. On this set, measure frequency of occurrence with the natural uniform measure (the product measure, uniform on each factor S^{n−1}).

Theorem 6: Let (n, m_n) be a sequence of problem sizes with n → ∞, n < m_n, and m_n ~ A n^γ, A > 0, and γ > 1. There exist η_i > 0 and ρ > 0 depending only on A and γ so that, for each δ > 0, the proportion of all n × m_n matrices Φ satisfying CS1–CS3 with parameters (η_i) and ρ eventually exceeds 1 − δ.

We will discuss and prove this in Section VII. The proof will show that the proportion of matrices not satisfying the conditions decays exponentially fast in n.
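Conditions CS1 and CS2 can be probed empirically on random draws (our sketch; the thresholds asserted below are observations about typical random matrices, not the theorem's constants): for random subsets J of size about ρn/log(m), the minimal singular value of Φ_J stays well away from 0, and vectors in V_J have ‖v‖_1 comparable to √n ‖v‖_2.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 256, 2048
Phi = rng.standard_normal((n, m))
Phi /= np.linalg.norm(Phi, axis=0)             # unit-normalized columns

rho = 0.5
k = int(rho * n / np.log(m))                   # |J| ~ rho * n / log(m)

min_sv, l1_ratios = [], []
for _ in range(20):
    J = rng.choice(m, size=k, replace=False)
    sub = Phi[:, J]
    min_sv.append(np.linalg.svd(sub, compute_uv=False).min())   # CS1: smallest singular value
    v = sub @ rng.standard_normal(k)           # a random vector in the range V_J
    l1_ratios.append(np.linalg.norm(v, 1) / (np.sqrt(n) * np.linalg.norm(v)))  # CS2 ratio

print(min(min_sv), min(l1_ratios))             # both bounded away from zero
```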
For later use, we will leave the constants (η_i) and ρ implicit and speak simply of CS matrices, meaning matrices that satisfy the given conditions with values of parameters of the type described by this theorem, namely, with (η_i) and ρ not depending on n and permitting the above ubiquity.

B. Near-Optimality of CS Matrices
We now show that the CS conditions imply near-optimality of widths induced by CS matrices.

Theorem 7: Let (n, m_n) be a sequence of problem sizes with n < m_n and m_n ~ A n^γ, A > 0, γ > 1. Consider a sequence of n by m_n matrices Φ obeying the conditions CS1–CS3 with η_i and ρ positive and independent of n. Then for each p ∈ (0, 1], there is C = C(p, (η_i), ρ, A, γ) so that

  w(Φ, b_{p,m_n}) ≤ C · (n / log m_n)^{1/2−1/p},  n > n_0.

Proof: Consider the optimization problem

  (Q_p)  sup ‖x‖_2 subject to Φx = 0, ‖x‖_p ≤ 1.

Our goal is to bound the value of (Q_p).

Choose N ≤ ρn/log(m); at the end of the proof N will be of exactly this order. Let J denote the indices of the N largest values in |x|. Without loss of generality, suppose coordinates are ordered so that J comes first among the m entries, and partition x = [x_J; x_{J^c}]. Clearly

  ‖x_{J^c}‖_p ≤ 1   (II.2)

while, because each entry in x_J is at least as big as any entry in x_{J^c}, (I.2) gives

  ‖x_{J^c}‖_2 ≤ ζ_{2,p} · (N + 1)^{1/2−1/p}.   (II.3)

A similar argument for ℓ^1 approximation gives, in case p < 1,

  ‖x_{J^c}‖_1 ≤ ζ_{1,p} · (N + 1)^{1−1/p}.   (II.4)

Now Φx = 0 gives Φ_J x_J = −Φ_{J^c} x_{J^c}. Hence, with v = Φ_J x_J, we have v ∈ V_J. As |J| = N ≤ ρn/log(m), and as x_{J^c} furnishes a representation of −v using only columns outside J, we can invoke CS3, getting

  ‖v‖_1 ≤ ‖x_{J^c}‖_1 / η_3.

On the other hand, again using v ∈ V_J and |J| ≤ ρn/log(m), invoke CS2, getting

  ‖v‖_2 ≤ ‖v‖_1 / (η_2 √n).

Combining these with the above,

  ‖v‖_2 ≤ ζ_{1,p} (N + 1)^{1−1/p} / (η_2 η_3 √n).

Recalling v = Φ_J x_J, and invoking CS1, we have

  ‖x_J‖_2 ≤ ‖v‖_2 / η_1.

In short, with c = ζ_{1,p} / (η_1 η_2 η_3),

  ‖x‖_2 ≤ ‖x_J‖_2 + ‖x_{J^c}‖_2 ≤ c · (N + 1)^{1−1/p} n^{−1/2} + ζ_{2,p} (N + 1)^{1/2−1/p}.

The theorem follows with N ≍ ρn/log(m); both terms are then O((n/log m)^{1/2−1/p}).
III. ALGORITHMS

Given an information operator I_n, we must design a reconstruction algorithm A_n which delivers reconstructions compatible in quality with the estimates for the Gel'fand n-widths. As discussed in the Introduction, the optimal method in the OR/IBC framework is the so-called central algorithm, which unfortunately is typically not efficiently computable in our setting. We now describe an alternate abstract approach, allowing us to prove Theorem 1.

A. Feasible-Point Methods

Another general abstract algorithm from the OR/IBC literature is the so-called feasible-point method, which aims simply to find any reconstruction compatible with the observed information and constraints.

As in the case of the central algorithm, we consider, for given information y_n = I_n(x), the collection of all objects in X which could have given rise to the information:

  X̂_{y_n} = {x : I_n(x) = y_n} ∩ X.
In the feasible-point method, we simply select any member of X̂_{y_n}, by whatever means. One can show, adapting standard OR/IBC arguments in [15], [6], [8], the following.

Lemma 3.1: Let y_n = I_n(x) where x ∈ X and I_n is an optimal information operator, and let x̂ be any element of X̂_{y_n}. Then, for X = X_{p,m}(R) with 0 < p ≤ 1 and R > 0,

  ‖x̂ − x‖_2 ≤ 2 · radius(X̂_{y_n}) ≤ 2 · E_n(X)   (III.1)

where, for given y_n,

  radius(X̂_{y_n}) = inf_c sup {‖x − c‖_2 : x ∈ X̂_{y_n}}.

In short, any feasible point is within a factor two of optimal.

Proof: We first justify our claims for optimality of the central algorithm, and then show that a feasible point is near to the central algorithm. Let again x̂*_n denote the result of the central algorithm; by definition of the center,

  sup {‖x̂*_n − x‖_2 : x ∈ X̂_{y_n}} = radius(X̂_{y_n}).

Now clearly, in the special case when x is only known to lie in X and y_n is measured, the minimax error is exactly radius(X̂_{y_n}). Since this error is achieved by the central algorithm for each such y_n, the minimax error over all x is achieved by the central algorithm. This minimax error is

  sup_{y_n} radius(X̂_{y_n}) = E_n(X).

Now the feasible point x̂ obeys ‖x̂ − x̂*_n‖_2 ≤ radius(X̂_{y_n}); hence, by the triangle inequality, for the object x generating y_n,

  ‖x̂ − x‖_2 ≤ ‖x̂ − x̂*_n‖_2 + ‖x̂*_n − x‖_2 ≤ 2 · radius(X̂_{y_n})

and (III.1) follows.

More generally, if the information operator is only near-optimal, E_n(I_n, X) ≤ C · E_n(X), then the same argument gives

  ‖x̂ − x‖_2 ≤ 2 · C · E_n(X).   (III.2)

A popular choice of feasible point is to take an element of least ℓ^p norm, i.e., a solution of the problem

  min ‖θ‖_p subject to I_n(x) = y_n

where here θ is the vector of transform coefficients, θ = Ψ^T x. A nice feature of this approach is that it is not necessary to know the radius R of the ball X_{p,m}(R); the element of least ℓ^p norm will always lie inside it. Indeed, the solution x̂_{p,n} obeys ‖θ(x̂_{p,n})‖_p ≤ ‖θ(x)‖_p ≤ R, so x̂_{p,n} belongs to X̂_{y_n}. Hence, by the preceding lemma, this procedure is near-minimax:

  ‖x̂_{p,n} − x‖_2 ≤ 2 · E_n(X_{p,m}(R)).

B. Proof of Theorem 3

Before proceeding, it is convenient to prove Theorem 3. Note that the case p = 1 is well known in OR/IBC, so we only need to give an argument for p < 1 (though it happens that our argument works for p = 1 as well). The key point will be to apply the p-triangle inequality

  ‖θ + θ′‖_p^p ≤ ‖θ‖_p^p + ‖θ′‖_p^p,

valid for 0 < p < 1; this inequality is well known in interpolation theory [17] through Peetre and Sparr's work, and is easy to verify directly.

Suppose without loss of generality that there is an optimal subspace for the Gel'fand n-width, which is fixed and given in this proof. As we just saw,

  E_n(X) = sup_{y_n} radius(X̂_{y_n})

and, taking in particular y_n = 0,

  E_n(X) ≥ radius(X̂_0) = d^n(X; ℓ^2)

since X̂_0 is exactly the section of X cut out by the nullspace. In the other direction, suppose that x′ and x″ both lie in X̂_{y_n}, and define δ = x′ − x″. Then I_n(δ) = 0 and, by the p-triangle inequality,

  ‖θ(δ)‖_p^p ≤ ‖θ(x′)‖_p^p + ‖θ(x″)‖_p^p ≤ 2R^p

so δ lies in the section of X_{p,m}(2^{1/p} R) cut out by the nullspace. By homogeneity,

  ‖δ‖_2 ≤ 2^{1/p} · d^n(X_{p,m}(R); ℓ^2).

Hence the diameter of every set X̂_{y_n} is at most 2^{1/p} d^n(X), and since an optimal reconstruction errs, in the worst case, by at most half of this diameter, E_n(X) ≤ 2^{1/p−1} · d^n(X; ℓ^2). This proves (I.5); for p = 1 the two bounds combine to give (I.4).
C. Proof of Theorem 1

We are now in a position to prove Theorem 1 of the Introduction.

First, in the case p = 1, we have already explained in the Introduction that the theorem of Garnaev and Gluskin implies the result by duality. In the case 0 < p < 1, we need only to show a lower bound and an upper bound of the same order.

For the lower bound, we consider the entropy numbers, defined as follows. Let X be a set and let e_n(X; ℓ^2) be the smallest number ε such that an ε-net for X can be built using a net of cardinality at most 2^n. From Carl's theorem [18]—see the exposition in Pisier's book [19]—there is a constant c > 0 so that the Gel'fand n-widths dominate the entropy numbers.

Secondly, the entropy numbers obey [20], [21]

  e_n(b_{p,m}; ℓ^2) ≍ (n / log(m/n + 1))^{1/2−1/p}.

Together these give a lower bound on d^n(b_{p,m}; ℓ^2), and hence on E_n, of the announced order.

At the same time, the combination of Theorems 7 and 6 shows that

  d^n(b_{p,m}; ℓ^2) ≤ C · (n / log(m))^{1/2−1/p}.

Applying now the feasible-point method, we have

  E_n(X_{p,m}(1)) ≤ C′ · (n / log(m))^{1/2−1/p}

with immediate extensions to X_{p,m}(R) for all R > 0 by homogeneity. We conclude that

  E_n(X_{p,m}(R)) ≍ R · (n / log(m))^{1/2−1/p}

as was to be proven.
D. Proof of Theorem 2

Now is an opportune time to prove Theorem 2. We note that in the case p = 1, this is known [22]–[25]. The argument is the same for 0 < p < 1, and we simply repeat it. Suppose that x = 0, and consider the adaptively constructed subspace according to whatever algorithm is in force. When the algorithm terminates, we have an n-dimensional information vector 0 and a subspace X̂_0 consisting of objects which would all give that information vector. For all objects in X̂_0, the adaptive information therefore turns out the same. Now the minimax error associated with that information is exactly radius(X̂_0); but this cannot be smaller than

  radius(X̂_0) ≥ d^n(X; ℓ^2)

since X̂_0 is a section of X cut out by an n-codimensional subspace. The result follows by comparing this lower bound on E_n^{Adapt} with the upper bound on E_n in terms of d^n given by Theorem 3.
IV. BASIS PURSUIT

The least ℓ^p-norm method of the previous section has two drawbacks. First, it requires that one know p; we prefer an algorithm which works for 0 < p ≤ 1. Second, if p < 1, the least-norm problem invokes a nonconvex optimization procedure, and would be considered intractable. In this section, we correct both drawbacks.
A. The Case p = 1

In the case p = 1, the least-norm problem is a convex optimization problem. Written in an equivalent form, with θ being the optimization variable, it becomes

  (P_1)  min ‖θ‖_1 subject to Φθ = y_n.

This can be formulated as a linear programming problem: let A be the n by 2m matrix [Φ, −Φ]. The linear program

  (LP)  min 1^T z subject to Az = y_n, z ≥ 0   (IV.1)

has a solution z*, say, a vector in R^{2m} which can be partitioned as z* = [u*; v*]; then θ* = u* − v* solves (P_1). The reconstruction x̂_{1,n} is Ψθ*. This linear program is typically considered computationally tractable. In fact, this problem has been studied in the signal analysis literature under the name Basis Pursuit [26]; in that work, very large-scale underdetermined problems—e.g., with n = 8192 and m = 262144—were solved successfully using interior-point optimization methods.
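The reduction to (IV.1) can be sketched end to end (our illustration, using SciPy's LP solver in place of the interior-point codes of [26], and taking Ψ to be the identity): write θ = u − v with u, v ≥ 0, solve the LP, and recover θ* = u* − v*.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
n, m = 60, 200
Phi = rng.standard_normal((n, m)) / np.sqrt(n)

theta = np.zeros(m)                            # a sparse coefficient vector ...
support = rng.choice(m, size=8, replace=False)
theta[support] = rng.standard_normal(8)
y = Phi @ theta                                # ... observed through n < m measurements

# (LP): min 1'z  subject to  [Phi, -Phi] z = y,  z >= 0,  with z = [u; v]
A = np.hstack([Phi, -Phi])
res = linprog(c=np.ones(2 * m), A_eq=A, b_eq=y, bounds=(0, None), method="highs")
theta_hat = res.x[:m] - res.x[m:]              # theta* = u* - v* solves (P_1)

print(np.max(np.abs(theta_hat - theta)))       # recovery up to solver tolerance
```

With 8 nonzeros observed through 60 random measurements out of m = 200 coordinates, the minimum-ℓ^1 solution typically coincides with the sparse generator.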
As far as performance goes, we already know that this procedure is near-optimal in case p = 1; from (III.2) we have the following.

Corollary 4.1: Suppose that I_n is an information operator achieving, for some C > 0,

  E_n(I_n, X_{1,m}(R)) ≤ C · E_n(X_{1,m}(R));

then the Basis Pursuit algorithm A_{1,n} achieves, for all R > 0,

  E_n(A_{1,n}, I_n, X_{1,m}(R)) ≤ 2C · E_n(X_{1,m}(R)).

In particular, we have a universal algorithm for dealing with any class X_{1,m}(R)—i.e., any m, any n, any R. First, apply a near-optimal information operator; second, reconstruct by Basis Pursuit. The result obeys

  ‖x̂_{1,n} − x‖_2 ≤ C · R · (n / log m)^{−1/2}

for a constant C depending at most on log(m)/log(n). The inequality can be put to use as follows. Fix ε > 0. Suppose the unknown object x is known to be highly compressible, say obeying the a priori bound ‖θ(x)‖_1 ≤ R. For any such object, rather than making m measurements, we only need to make n measurements, with n chosen large enough that the right-hand side above falls below ε, and our reconstruction obeys

  ‖x̂_{1,n} − x‖_2 ≤ ε.

While the case p = 1 is already significant and interesting, the case 0 < p < 1 is of interest because it corresponds to data which are more highly compressible, offering more impressive performance in Theorem 1, because the exponent 1/2 − 1/p is even stronger than in the p = 1 case. Later in this section, we extend the same interpretation of x̂_{1,n} to performance over X_{p,m}(R) throughout 0 < p < 1.
B. Relation Between ℓ^1 and ℓ^0 Minimization

The general OR/IBC theory would suggest that to handle cases where p < 1, we would need to solve the nonconvex optimization problem (P_p), which seems intractable. However, in the current situation at least, a small miracle happens: solving (P_1) is again near-optimal. To understand this, we first take a small detour, examining the relation between ℓ^1 and the extreme case p = 0 of the ℓ^p spaces. Let us define

  (P_0)  min ‖θ‖_0 subject to Φθ = y_n

where ‖θ‖_0 is just the number of nonzeros in θ. Again, since the work of Peetre and Sparr [16], the importance of ℓ^0 and the relation with ℓ^p for 0 < p < 1 is well understood; see [17] for more detail.

Ordinarily, solving such a problem involving the ℓ^0 norm requires combinatorial optimization; one enumerates all sparse subsets of {1, …, m} searching for one which allows a solution of Φθ = y_n. However, when (P_0) has a sparse solution, (P_1) will find it.

Theorem 8: Suppose that Φ satisfies CS1–CS3 with given positive constants ρ, (η_i). There is a constant ρ_0 > 0 depending only on ρ and (η_i), and not on n or m, so that, if some solution to (P_0) has at most ρ_0 · n/log(m) nonzeros, then (P_0) and (P_1) both have the same unique solution.

In words, although the system of equations is massively underdetermined, ℓ^1 minimization and sparse solution coincide—when the result is sufficiently sparse.
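On a toy instance the coincidence can be verified directly (our sketch): brute-force enumeration solves (P_0), the linear program of Section IV-A solves (P_1), and the two minimizers agree.

```python
import itertools
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(4)
n, m = 20, 40
Phi = rng.standard_normal((n, m)) / np.sqrt(n)
theta = np.zeros(m)
theta[[5, 23]] = [1.5, -2.0]                   # a 2-sparse solution
y = Phi @ theta

def solve_p0(Phi, y, max_k=3, tol=1e-8):
    """(P_0) by enumeration: smallest support reproducing y exactly."""
    n, m = Phi.shape
    for kk in range(1, max_k + 1):
        for S in itertools.combinations(range(m), kk):
            sub = Phi[:, S]
            coef = np.linalg.lstsq(sub, y, rcond=None)[0]
            if np.linalg.norm(sub @ coef - y) < tol:
                out = np.zeros(m)
                out[list(S)] = coef
                return out
    return None

theta0 = solve_p0(Phi, y)

# (P_1) via the linear program (IV.1)
A = np.hstack([Phi, -Phi])
res = linprog(np.ones(2 * m), A_eq=A, b_eq=y, bounds=(0, None), method="highs")
theta1 = res.x[:m] - res.x[m:]

print(np.max(np.abs(theta0 - theta1)))         # the two minimizers coincide
```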
There is by now an extensive literature exhibiting results on equivalence of ℓ^1 and ℓ^0 minimization [27]–[34]. In the early literature on this subject, equivalence was found under conditions involving sparsity constraints allowing O(n^{1/2}) nonzeros. While it may seem surprising that any results of this kind are possible, the sparsity constraint O(n^{1/2}) is, ultimately, disappointingly small. A major breakthrough was the contribution of Candès, Romberg, and Tao [13] which studied the matrices built by taking n rows at random from an m by m Fourier matrix and gave an order n/log(n) bound, showing that dramatically weaker sparsity conditions were needed than the O(n^{1/2}) results known previously. In [11], it was shown that for "nearly all" n by m matrices with proportional dimensions m ~ An, equivalence held for up to ρn nonzeros, ρ = ρ(A). The above result says effectively that for "nearly all" n by m matrices with m ~ A n^γ, equivalence holds up to ρ_0 n/log(m) nonzeros, where ρ_0 = ρ_0(A, γ).

Our argument, in parallel with [11], shows that the nullspace of Φ has a very special structure for Φ obeying the conditions in question. When θ is sparse, the only element in a given affine subspace θ + nullspace(Φ) which can have small ℓ^1 norm is θ itself.

To prove Theorem 8, we first need a lemma about the non-sparsity of elements of the nullspace of Φ. Let J ⊂ {1, …, m} and, for a given vector δ ∈ R^m, let δ_J denote the mutilated vector with entries δ_i 1_J(i). Define the concentration

  ν(Φ, J) = sup {‖δ_J‖_1 / ‖δ‖_1 : δ ∈ nullspace(Φ), δ ≠ 0}.
This measures the fraction of ℓ^1 norm which can be concentrated on a certain subset, for a vector in the nullspace of Φ. This concentration cannot be large if |J| is small.

Lemma 4.1: Suppose that Φ satisfies CS1–CS3, with constants (η_i) and ρ. There is a constant η_0 depending on the (η_i) so that, if J satisfies |J| ≤ ρ · n/log(m), then

  ν(Φ, J) ≤ η_0 · (|J| / n)^{1/2}.

Proof: This is a variation on the argument for Theorem 7. Let δ ∈ nullspace(Φ). Assume without loss of generality that J is the most concentrated subset of cardinality |J| ≤ ρn/log(m), and that the columns of Φ are numbered so that J comes first; partition δ = [δ_J; δ_{J^c}]. We again consider v = Φ_J δ_J, and have Φ_J δ_J = −Φ_{J^c} δ_{J^c}. We again invoke CS2–CS3, getting

  ‖v‖_2 ≤ ‖δ_{J^c}‖_1 / (η_2 η_3 √n).

We invoke CS1, getting

  ‖δ_J‖_2 ≤ ‖v‖_2 / η_1.

Now, of course, ‖δ_J‖_1 ≤ |J|^{1/2} ‖δ_J‖_2. Combining all these,

  ‖δ_J‖_1 ≤ (|J| / n)^{1/2} · ‖δ_{J^c}‖_1 / (η_1 η_2 η_3) ≤ (|J| / n)^{1/2} · ‖δ‖_1 / (η_1 η_2 η_3).

The lemma follows, setting η_0 = (η_1 η_2 η_3)^{−1}.
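The concentration ν(Φ, J) can be lower-bounded by random search (our sketch; the exact supremum is itself a nontrivial optimization, and random sampling only probes typical nullspace directions): even the sampled maximum stays far below the critical value 1/2 for small |J|.

```python
import numpy as np
from scipy.linalg import null_space

rng = np.random.default_rng(5)
n, m = 100, 400
Phi = rng.standard_normal((n, m)) / np.sqrt(n)
V = null_space(Phi)                      # orthonormal basis of nullspace(Phi), shape (m, m - n)

J = np.arange(8)                         # a fixed small subset, |J| = 8
fracs = []
for _ in range(500):
    delta = V @ rng.standard_normal(V.shape[1])     # a random nullspace vector
    fracs.append(np.abs(delta[J]).sum() / np.abs(delta).sum())

nu_lower = max(fracs)                    # a lower bound on nu(Phi, J)
print(nu_lower)                          # far below 1/2
```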
Proof of Theorem 8: Suppose that y_n = Φθ and that θ is supported on a subset J.

We first show that, if ν(Φ, J) < 1/2, θ is the only minimizer of (P_1). Suppose that θ′ is a solution to (P_1), obeying ‖θ′‖_1 ≤ ‖θ‖_1. Then θ′ = θ + δ where δ ∈ nullspace(Φ). We have

  ‖θ‖_1 ≥ ‖θ + δ‖_1 ≥ ‖θ‖_1 − ‖δ_J‖_1 + ‖δ_{J^c}‖_1.

Invoking the definition of ν(Φ, J) twice,

  ‖δ_{J^c}‖_1 − ‖δ_J‖_1 ≥ (1 − 2ν(Φ, J)) · ‖δ‖_1.

Now ν(Φ, J) < 1/2 gives

  (1 − 2ν(Φ, J)) · ‖δ‖_1 ≤ 0

and we have δ = 0, i.e., θ′ = θ.

Now recall the constant η_0 of Lemma 4.1. Define ρ_0 so that ρ_0 ≤ ρ and η_0 √ρ_0 < 1/2. Lemma 4.1 shows that |J| ≤ ρ_0 · n/log(m) implies ν(Φ, J) < 1/2.
C. Near-Optimality of Basis Pursuit for 0 < p < 1

We now return to the claimed near-optimality of Basis Pursuit throughout the range 0 < p < 1.

Theorem 9: Suppose that Φ satisfies CS1–CS3 with constants (η_i) and ρ. There is C = C(p, (η_i), ρ) > 0 so that a solution θ̂_1 to a problem instance of (P_1) with ‖θ‖_p ≤ R obeys

  ‖θ̂_1 − θ‖_2 ≤ C · R · (n / log m)^{1/2−1/p}.

The proof requires an ℓ^1 stability lemma, showing the stability of ℓ^1 minimization under small perturbations as measured in ℓ^1 norm. For ℓ^2 and ℓ^∞ stability lemmas, see [33]–[35]; however, note that those lemmas do not suffice for our needs in this proof.
Lemma 4.2: Let θ be a vector in R^m and let θ_J be the corresponding mutilated vector with entries θ_i 1_J(i). Suppose that

  ‖θ − θ_J‖_1 ≤ ε

where ν(Φ, J) ≤ ν_0 < 1/2. Consider the instance of (P_1) defined by y_n = Φθ; the solution θ̂_1 of this instance of (P_1) obeys

  ‖θ̂_1 − θ‖_1 ≤ 2ε / (1 − 2ν_0).   (IV.2)

Proof of Lemma 4.2: Put δ = θ̂_1 − θ for short, and set ε = ‖θ − θ_J‖_1. By definition of (P_1),

  ‖θ̂_1‖_1 ≤ ‖θ‖_1 ≤ ‖θ_J‖_1 + ε

while

  ‖θ̂_1‖_1 = ‖θ + δ‖_1 ≥ ‖θ_J + δ‖_1 − ε.

As δ solves Φδ = 0, and of course

  ‖θ_J + δ‖_1 ≥ ‖θ_J‖_1 − ‖δ_J‖_1 + ‖δ_{J^c}‖_1 ≥ ‖θ_J‖_1 + (1 − 2ν_0) ‖δ‖_1.

Hence,

  ‖θ_J‖_1 + (1 − 2ν_0) ‖δ‖_1 − ε ≤ ‖θ_J‖_1 + ε.

Finally,

  (1 − 2ν_0) · ‖δ‖_1 ≤ 2ε

and (IV.2) follows.
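The stability phenomenon of Lemma 4.2 shows up clearly in simulation (our sketch; the factor 10 in the final comparison is a loose empirical margin, not the lemma's constant): when θ has a few large entries plus a small ℓ^1 tail ε, the (P_1) solution errs in ℓ^1 by only a modest multiple of ε.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(6)
n, m = 80, 240
Phi = rng.standard_normal((n, m)) / np.sqrt(n)

# Compressible, not exactly sparse: 6 big entries plus a small tail
theta = np.zeros(m)
theta[:6] = [3.0, -2.5, 2.0, -1.5, 1.0, 0.8]
theta[6:] = 1e-3 * rng.standard_normal(m - 6)
eps = np.abs(theta[6:]).sum()                  # ell^1 tail beyond the 6 biggest entries
y = Phi @ theta

A = np.hstack([Phi, -Phi])
res = linprog(np.ones(2 * m), A_eq=A, b_eq=y, bounds=(0, None), method="highs")
theta_hat = res.x[:m] - res.x[m:]

err1 = np.abs(theta_hat - theta).sum()
print(err1, eps)                               # ell^1 error of the same order as the tail
```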
Proof of Theorem 9: We use the same general framework as in Theorem 7. Let y_n = Φθ where ‖θ‖_p ≤ 1. Let θ̂_1 be the solution to (P_1), and set δ = θ̂_1 − θ. Let η_0 be as in Lemma 4.1, and set N = ⌊ρ_0 n/log(m)⌋ with ρ_0 as in the proof of Theorem 8. Let J index the N largest-amplitude entries in θ. From ‖θ‖_p ≤ 1 and (II.4) we have

  ‖θ − θ_J‖_1 ≤ ζ_{1,p} · (N + 1)^{1−1/p}

and Lemma 4.1 provides ν(Φ, J) ≤ ν_0 < 1/2. Applying Lemma 4.2,

  ‖δ‖_1 ≤ 2 ζ_{1,p} (N + 1)^{1−1/p} / (1 − 2ν_0).   (IV.3)

The vector δ lies in nullspace(Φ) and has ‖δ‖_1 bounded as in (IV.3). Hence, applying the case p = 1 of Theorem 7 to δ/‖δ‖_1, we conclude by homogeneity that

  ‖δ‖_2 ≤ C · (n / log m)^{−1/2} · ‖δ‖_1.

Combining this with (IV.3) and N ≍ n/log(m),

  ‖δ‖_2 ≤ C′ · (n / log m)^{1/2−1/p};

the case of general R > 0 follows by homogeneity.
V.IMMEDIATEEXTENSIONS
Beforecontinuing,wementiontwoimmediateextensionstotheresultssofar;theyareofinterestbelowandelsewhere.A.TightFrames
Our main results so far have been stated in the context of Ψ making an orthonormal basis. In fact, the results hold for tight frames. These are collections of vectors which, when joined together as columns in an m × m′ matrix Ψ, have ΨΨᵀ = I_m. It follows that, if θ = Ψᵀx, then we have the Parseval relation

‖θ‖₂ = ‖x‖₂

and the reconstruction formula x = Ψθ. In fact, Theorems 7 and 9 only need the Parseval relation in the proof. Hence, the same results hold without change when the relation between x and θ involves a tight frame. In particular, if Φ is an n × m′ matrix satisfying CS1–CS3, then combining it with the frame analysis operator defines a near-optimal information operator, and solution of the corresponding ℓ¹ optimization problem defines a near-optimal reconstruction algorithm.
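A quick numerical illustration of these identities (not from the paper; the "Mercedes-Benz" frame of three unit vectors in R², suitably scaled, is a standard textbook example of a tight frame):

```python
import numpy as np

# Three equally spaced unit vectors in R^2, scaled so that Psi Psi^T = I_2.
angles = np.pi / 2 + 2 * np.pi * np.arange(3) / 3
Psi = np.sqrt(2.0 / 3.0) * np.vstack([np.cos(angles), np.sin(angles)])

print(np.allclose(Psi @ Psi.T, np.eye(2)))          # tight-frame condition

x = np.array([0.7, -1.3])
theta = Psi.T @ x                                   # analysis coefficients
print(np.allclose(np.sum(theta**2), np.sum(x**2)))  # Parseval relation
print(np.allclose(Psi @ theta, x))                  # reconstruction formula
```

Note that Ψᵀ here has more rows than columns (m′ > m), so θ is a redundant representation; only the Parseval relation, not orthonormality, is used.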
A referee remarked that there is no need to restrict attention to tight frames here; if we have a general frame, the same results go through, with constants involving the frame bounds. This is true and potentially very useful, although we will not use it in what follows.

B. Weak-ℓᵖ Balls
Our main results so far have been stated for ℓᵖ spaces, but the proofs hold for weak-ℓᵖ balls as well. The weak-ℓᵖ ball of radius η consists of vectors θ whose decreasing rearrangements |θ|₍₁₎ ≥ |θ|₍₂₎ ≥ |θ|₍₃₎ ≥ ... obey

|θ|₍N₎ ≤ η N^(−1/p), N = 1, 2, 3, ....

Conversely, for a given θ, the smallest η for which these inequalities all hold is defined to be the weak-ℓᵖ norm ‖θ‖_{wℓᵖ}. The "weak" moniker derives from ‖θ‖_{wℓᵖ} ≤ ‖θ‖_{ℓᵖ}. Weak-ℓᵖ constraints have the following key property: if θ_N denotes a mutilated version of the vector θ with all except the N largest items set to zero, then the inequality

‖θ − θ_N‖_q ≤ ζ_{p,q} ‖θ‖_{wℓᵖ} (N + 1)^{1/q − 1/p}  (V.1)

is valid for q = 1, 2 and p < q, with ζ_{p,q} a constant depending on p and q only. In fact, Theorems 7 and 9 only needed (V.1) in the proof, together with (implicitly) ‖θ_N‖₂ ≤ ‖θ‖₂. Hence, we can state results for spaces defined using only weak-ℓᵖ norms, and the proofs apply without change.
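The decreasing-rearrangement definition and an ℓ² tail bound of the form (V.1) can be checked numerically; in this sketch the constant in the bound is a crude integral-test estimate (an illustrative assumption, not the paper's ζ_{p,q}):

```python
import numpy as np

def weak_lp_norm(theta, p):
    # Smallest eta with |theta|_(N) <= eta * N**(-1/p) for every N,
    # computed from the decreasing rearrangement.
    r = np.sort(np.abs(theta))[::-1]
    N = np.arange(1, len(r) + 1)
    return np.max(r * N ** (1.0 / p))

p = 0.5
theta = np.random.default_rng(0).standard_normal(200)
eta = weak_lp_norm(theta, p)

# The weak-lp norm is dominated by the lp (quasi-)norm.
lp = np.sum(np.abs(theta) ** p) ** (1.0 / p)
print(eta <= lp)

# Tail bound: dropping all but the N largest entries leaves l2 error of size
# O(eta * N**(1/2 - 1/p)); the constant below comes from an integral test.
r = np.sort(np.abs(theta))[::-1]
N = 20
tail = np.sqrt(np.sum(r[N:] ** 2))
bound = eta * N ** (0.5 - 1.0 / p) / np.sqrt(1.0 / p - 0.5)
print(tail <= bound)
```

Both printed checks follow from the definitions: r_k ≤ η k^(−1/p) termwise, and summing the squared tail against the integral of x^(−2/p).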
VI. STYLIZED APPLICATIONS

We sketch three potential applications of the above abstract theory. In each case, we exhibit that a certain class of functions has expansion coefficients in a basis or frame that obey a particular ℓᵖ or weak-ℓᵖ embedding, and then apply the above abstract theory.
A. Bump Algebra

Consider the class of functions which are restrictions to the unit interval of functions belonging to the Bump Algebra [2], with bump norm at most 1. This was mentioned in the Introduction, which observed that the ℓ¹ norms of the wavelet coefficients at level j decay geometrically in j, where the constant depends only on the wavelet used. Here and later we use standard wavelet analysis notations as in [36], [37], [2].
We consider two ways of approximating functions in this class. In the classic linear scheme, we fix a "finest scale" 2^(−j₁) and measure the resumé coefficients β_{j₁,k} = ⟨f, φ_{j₁,k}⟩, where φ_{j,k}(t) = 2^(j/2) φ(2^j t − k), with φ a smooth function integrating to 1. Think of these as point samples at scale 2^(−j₁) after applying an anti-aliasing filter. We reconstruct by the corresponding partial sum, giving an approximation error that decays geometrically in j₁, with the constant depending only on the chosen wavelet. There are 2^(j₁) coefficients associated with the unit interval, and so the approximation error obeys a corresponding power-law bound in the number of samples.
In the compressed sensing scheme, we need also wavelets ψ_{j,k}(t) = 2^(j/2) ψ(2^j t − k), where ψ is an oscillating function with mean zero. We pick a coarsest scale 2^(−j₀). We measure the resumé coefficients β_{j₀,k} (there are 2^(j₀) of these) and then let θ denote an enumeration of the detail wavelet coefficients (θ_{j,k} : j₀ ≤ j < j₁, 0 ≤ k < 2^j). The dimension m of θ is 2^(j₁) − 2^(j₀). The ℓᵖ norm of θ satisfies a bound uniform over the class. This establishes the constraint on the ℓᵖ norm needed for our theory.

We take n much smaller than m, as the theory permits, and apply a near-optimal information operator for this m and p (described in more detail later). We apply the near-optimal algorithm of ℓ¹ minimization, getting an error estimate for ‖θ̂ − θ‖₂ with constant independent of m. The overall reconstruction then has L² error of the same form, again with constant independent of m. This is of the same order of magnitude as the error of linear sampling.
The compressed sensing scheme takes a total of 2^(j₀) samples of resumé coefficients and n samples associated with detail coefficients, for a total of 2^(j₀) + n pieces of information. It achieves error comparable to classical sampling based on 2^(j₁) samples. Thus, it needs dramatically fewer samples for comparable accuracy: roughly speaking, only the cube root of the number of samples of linear sampling.
To achieve this dramatic reduction in sampling, we need an information operator based on some Φ satisfying CS1–CS3. The underlying measurement kernels will be of the form

ξᵢ(t) = Σ_μ Φ_{i,μ} ψ_μ(t),  (VI.1)

where the collection (ψ_μ) is simply an enumeration of the wavelets ψ_{j,k}, with j₀ ≤ j < j₁ and 0 ≤ k < 2^j.
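A hypothetical discrete sketch of such measurement kernels, using the orthonormal Haar transform in place of the continuum wavelets (the sizes and the Gaussian rows are illustrative assumptions):

```python
import numpy as np

def haar_analysis(m):
    # Orthonormal Haar analysis matrix for m = 2^J: first rows are coarse
    # averages, later rows are the wavelet (difference) functionals.
    if m == 1:
        return np.eye(1)
    H = haar_analysis(m // 2)
    return np.vstack([np.kron(H, [1.0, 1.0]),
                      np.kron(np.eye(m // 2), [1.0, -1.0])]) / np.sqrt(2.0)

m, n = 64, 16
rng = np.random.default_rng(1)
W = haar_analysis(m)                       # theta = W @ x, x = W.T @ theta
Phi = rng.standard_normal((n, m))
Phi /= np.linalg.norm(Phi, axis=1, keepdims=True)   # rows on the unit sphere

# Measurement kernels xi_i in the spirit of (VI.1): random combinations of
# the basis functions, with weights given by the rows of Phi.
Xi = Phi @ W

x = rng.standard_normal(m)
# Measuring <xi_i, x> is the same as applying Phi to the coefficients W @ x.
print(np.allclose(Xi @ x, Phi @ (W @ x)))
```

The printed check is just associativity, but it makes the "holographic" character of the samples concrete: each measurement mixes all coefficients at once.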
B. Images of Bounded Variation

We consider now the model with images of Bounded Variation from the Introduction. Let the class consist of functions on the unit square having total variation at most 1 [38], and bounded in absolute value by 1 as well. In the Introduction, it was mentioned that the wavelet coefficients at level j obey an ℓ¹ bound uniform in j, where the constant depends only on the wavelet used. It is also true that the individual coefficients at level j are uniformly bounded by a constant times 2^(−j).
We again consider two ways of approximating functions in this class. The classic linear scheme uses a 2-D version of the scheme we have already discussed. We again fix a "finest scale" 2^(−j₁) and measure the resumé coefficients β_{j₁,k} = ⟨f, φ_{j₁,k}⟩, where now k = (k₁, k₂) is a pair of integers, 0 ≤ k₁, k₂ < 2^(j₁), indexing position. We use the Haar scaling function φ_{j,k}(x) = 2^j φ(2^j x − k), with φ the indicator of the unit square. We reconstruct by the corresponding partial sum, giving an approximation error that decays with the finest scale. There are 4^(j₁) coefficients associated with the unit square, and so the approximation error obeys a corresponding power-law bound in the number of samples.
In the compressed sensing scheme, we need also Haar wavelets ψ^σ_{j,k}(x) = 2^j ψ^σ(2^j x − k), where ψ^σ is an oscillating function with mean zero which is either horizontally oriented, vertically oriented, or diagonally oriented. We pick a "coarsest scale" 2^(−j₀), and measure the resumé coefficients β_{j₀,k}; there are 4^(j₀) of these. Then let θ be the concatenation of the detail wavelet coefficients (θ^σ_{j,k} : j₀ ≤ j < j₁). The dimension m of θ is 4^(j₁) − 4^(j₀). The ℓ¹ norm of θ obeys a bound uniform over the class. This establishes the constraint on the ℓᵖ norm needed for applying our theory. We take n much smaller than m, as the theory permits, and apply a near-optimal information operator for this m and p. We apply the near-optimal algorithm of ℓ¹ minimization to the resulting information, getting an error estimate for ‖θ̂ − θ‖₂ with constant independent of m. The overall reconstruction then has L² error of the same form, again with constant independent of m. This is of the same order of magnitude as the error of linear sampling.
The compressed sensing scheme takes a total of 4^(j₀) samples of resumé coefficients and n samples associated with detail coefficients, for a total of 4^(j₀) + n pieces of measured information. It achieves error comparable to classical sampling with 4^(j₁) samples. Thus, just as we have seen in the Bump Algebra case, we need dramatically fewer samples for comparable accuracy: roughly speaking, only the square root of the number of samples of linear sampling.

C. Piecewise C² Images With C² Edges
We now consider an example where p < 1, and we can apply the extensions to tight frames and to weak-ℓᵖ mentioned earlier. Again in the image processing setting, we use the C²/C² model discussed in Candès and Donoho [39], [40]. Consider the collection of piecewise smooth functions on the unit square, with values, first and second partial derivatives bounded by 1, away from an exceptional set which is a union of C² curves having first and second derivatives bounded by 1 in an appropriate parametrization; the curves have total length at most 1. More colorfully, such images are cartoons: patches of uniform smooth behavior separated by piecewise-smooth curvilinear boundaries. They might be reasonable models for certain kinds of technical imagery, e.g., in radiology.
The curvelets tight frame [40] is a collection of smooth frame elements offering a Parseval relation

‖f‖₂² = Σ_μ |⟨f, γ_μ⟩|²

and reconstruction formula f = Σ_μ ⟨f, γ_μ⟩ γ_μ. The frame elements have a multiscale organization, and frame coefficients grouped by scale obey a weak-ℓ^(2/3) constraint uniform over the class; compare [40]. For such objects, classical linear sampling at scale 2^(−j) by smooth 2-D scaling functions gives an approximation error of the same order as before.
This is no better than the performance of linear sampling for the Bounded Variation case, despite the piecewise C² character of f; the possible discontinuities in f are responsible for the inability of linear sampling to improve its performance over the Bounded Variation case.
In the compressed sensing scheme, we pick a coarsest scale 2^(−j₀). We measure the resumé coefficients in a smooth wavelet expansion (there are 4^(j₀) of these) and then let θ denote a concatenation of the finer scale curvelet coefficients. The dimension of θ is of order 4^(j₁), with the constant due to overcompleteness of curvelets. The weak-ℓ^(2/3) "norm" of θ obeys a uniform bound, with constant depending on the derivative and length bounds defining the class. This establishes the constraint on weak-ℓᵖ norm needed for our theory. We take n much smaller than m, as the theory permits, and apply a near-optimal information operator for this m and p. We apply the near-optimal algorithm of ℓ¹ minimization to the resulting information, getting an error estimate for ‖θ̂ − θ‖₂ with absolute constant. The overall reconstruction then has L² error of the same form, again with constant independent of m. This is of the same order of magnitude as the error of linear sampling.
The compressed sensing scheme takes a total of 4^(j₀) samples of resumé coefficients and n samples associated with detail coefficients, for a total of 4^(j₀) + n pieces of information. It achieves error comparable to classical sampling based on 4^(j₁) samples. Thus, even more so than in the Bump Algebra case, we need dramatically fewer samples for comparable accuracy: roughly speaking, only the fourth root of the number of samples of linear sampling.
VII. NEARLY ALL MATRICES ARE CS MATRICES

We may reformulate Theorem 6 as follows.

Theorem 10: Let (n, mₙ) be a sequence of problem sizes with n → ∞ and mₙ growing polynomially in n. Let Φ be an n × mₙ matrix with columns drawn independently and uniformly at random from the unit sphere Sⁿ⁻¹. Then for some choice of constants ηᵢ > 0 and ρ > 0, conditions CS1–CS3 hold for Φ with overwhelming probability for all large n.

Indeed, note that the probability measure on Φ induced by sampling columns i.i.d. uniform on Sⁿ⁻¹ is exactly the natural uniform measure on such matrices. Hence, Theorem 6 follows immediately from Theorem 10.

In effect, matrices satisfying the CS conditions are so ubiquitous that it is reasonable to generate them by sampling at random from a uniform probability distribution.
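The ubiquity claim can be probed empirically; the following sketch (the sizes and the sparsity level are illustrative assumptions, not the paper's constants) draws columns uniformly on the sphere and inspects the minimum eigenvalue of the Gram matrix Φ_Jᵀ Φ_J over random subsets J:

```python
import numpy as np

# Columns i.i.d. uniform on S^(n-1): normalize Gaussian columns.
rng = np.random.default_rng(2)
n, m, k = 100, 400, 10              # k plays the role of rho * n / log(m)
Phi = rng.standard_normal((n, m))
Phi /= np.linalg.norm(Phi, axis=0)

lams = []
for _ in range(200):                # random subsets J of size k
    J = rng.choice(m, size=k, replace=False)
    G = Phi[:, J].T @ Phi[:, J]     # Gram matrix of the selected columns
    lams.append(np.linalg.eigvalsh(G)[0])   # its minimum eigenvalue
print(min(lams))                    # typically bounded well away from zero
```

Of course, sampling 200 subsets is no substitute for the simultaneous control over all subsets that the proof below establishes; it only illustrates the concentration behind Lemma 7.2.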
The proof of Theorem 10 is conducted over Sections VII-A–C; it proceeds by studying events Ω¹ₙ = {CS1 Holds}, and Ω²ₙ, Ω³ₙ defined analogously. It will be shown that for appropriate parameters ηᵢ and ρ, defining Ωₙ as the intersection of these events, we have P(Ωₙ) → 1 as n → ∞. Since, when Ωₙ occurs, our random draw has produced a matrix obeying CS1–CS3 with parameters ηᵢ and ρ, this proves Theorem 10. The proof actually shows that for some β > 0, the failure probability is at most exp(−βn) for large n; the convergence is exponentially fast.
A. Control of Minimal Eigenvalue

The following lemma allows us to choose positive constants ρ and η₁ so that condition CS1 holds with overwhelming probability.

Lemma 7.1: Consider sequences of problem sizes (n, mₙ) as in Theorem 10. Define the event that the minimal eigenvalue of the Gram matrix Φ_Jᵀ Φ_J exceeds η₁ simultaneously for every subset J of cardinality at most ρn/log mₙ. Then, for each η₁ < 1, this event has probability tending to 1 for sufficiently small ρ > 0.

The proof involves three ideas. The first idea is that the event of interest for Lemma 7.1 is representable in terms of events indexed by individual subsets J. Our plan is to bound the probability of occurrence of every such individual event.
The second idea is that for a specific subset J, we get large deviations bounds on the minimum eigenvalue; this can be stated as follows.

Lemma 7.2: For a fixed subset J of cardinality at most ρn/log mₙ, let Ωₙ,J denote the event that the minimum eigenvalue of Φ_Jᵀ Φ_J exceeds η₁. Then there is β > 0 so that, for sufficiently small ρ and all large n, the complementary event obeys P(Ωₙ,Jᶜ) ≤ exp(−βn), uniformly in J.

This was derived in [11] and in [35], using the concentration of measure property of singular values of random matrices, e.g., see Szarek's work [41], [42].
The third and final idea is that bounds for individual subsets can control simultaneous behavior over all subsets. This is expressed as follows.

Lemma 7.3: Suppose we have events Ωₙ,J all obeying P(Ωₙ,Jᶜ) ≤ exp(−βn) for some fixed β > 0 and for each subset J with cardinality at most ρn/log mₙ. Pick ρ′ < β, and consider subsets of cardinality at most ρ′n/log mₙ. Then for all sufficiently large n,

P(Ωₙ,J holds for every such J) ≥ 1 − exp(−β′n)

for some β′ > 0.

Our main goal of this subsection, Lemma 7.1, now follows by combining these three ideas.
It remains only to prove Lemma 7.3. Let Ωₙ denote the intersection of the events Ωₙ,J over all subsets J of cardinality at most k = ρ′n/log mₙ. We note that, by Boole's inequality,

P(Ωₙᶜ) ≤ Σ_J P(Ωₙ,Jᶜ) ≤ #{J} · exp(−βn),

the last inequality following because each member J is of cardinality at most k. Also, of course, the number of such subsets obeys #{J} ≤ (k + 1)·C(mₙ, k) ≤ exp(k log mₙ + o(n)) ≤ exp(ρ′n + o(n)), so we get P(Ωₙᶜ) ≤ exp(−(β − ρ′)n + o(n)). Taking ρ′ < β as given, we get the desired conclusion.
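The counting step in this union bound can be checked with explicit numbers (the constants ρ′ = 0.5 and β = 2.0 below are illustrative assumptions, not the paper's):

```python
from math import comb, log, e

m, n = 10_000, 1_000
rho_prime, beta = 0.5, 2.0          # illustrative constants
k = int(rho_prime * n / log(m))     # sparsity level k ~ rho' * n / log(m)

log_count = log(comb(m, k))         # log of the number of subsets with |J| = k
print(log_count <= k * log(e * m / k))   # since C(m, k) <= (e m / k)^k

# Boole's inequality: P(some J fails) <= count * exp(-beta * n); this tends
# to zero exactly when the entropy term log_count stays below beta * n.
print(log_count < beta * n)
```

The key point, as in the proof, is that k log m ≈ ρ′n, so a per-subset exponent β > ρ′ beats the combinatorial growth.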
B. Spherical Sections Property

We now show that condition CS2 can be made overwhelmingly likely for large n by choice of η₂ and ρ sufficiently small but still positive. Our approach derives from [11], which applied an important result from Banach space theory, the almost-spherical sections phenomenon. This says that slicing the unit ball in a Banach space by intersection with an appropriate finite-dimensional linear subspace will result in a slice that is effectively spherical [43], [44]. We develop a quantitative refinement of this principle for the ℓ¹ norm in Rⁿ, showing that, with overwhelming probability, every operator Φ_J with |J| at most ρn/log mₙ affords a spherical section of the ℓ¹ ball. The basic argument we use originates from work of Milman, Kashin, and others [44], [10], [45]; we refine an argument in Pisier [19] and, as in [11], draw inferences that may be novel. We conclude that not only do almost-spherical sections exist, but they are so ubiquitous that every Φ_J with |J| at most ρn/log mₙ will generate them.

Definition 7.1: Let ε ∈ (0, 1). We say that Φ_J offers an ε-isometry between ℓ₂(J) and ℓ¹ if, for all α supported on J,

(1 − ε)·‖α‖₂ ≤ √(π/(2n))·‖Φ_J α‖₁ ≤ (1 + ε)·‖α‖₂.  (VII.1)
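The normalization √(2n/π) in (VII.1) reflects the fact that for a vector with i.i.d. Gaussian entries, the ratio ‖v‖₁/‖v‖₂ concentrates near √(2n/π). This heuristic behind the spherical sections phenomenon is easy to check numerically (the sample sizes and tolerance are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)
n, trials = 20_000, 50
ratios = np.empty(trials)
for t in range(trials):
    v = rng.standard_normal(n)
    # l1/l2 ratio of a "generic" direction in R^n
    ratios[t] = np.linalg.norm(v, 1) / np.linalg.norm(v)
ratios /= np.sqrt(2 * n / np.pi)    # predicted concentration point
print(ratios.min() > 0.95, ratios.max() < 1.05)
```

The content of Lemma 7.4 below is much stronger: the same near-equality holds simultaneously for every vector in every subspace spanned by a small column subset, not just for generic directions.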
The following lemma shows that condition CS2 is a generic property of matrices.

Lemma 7.4: Consider the event that every Φ_J with |J| at most ρn/log mₙ offers an ε-isometry between ℓ₂(J) and ℓ¹. For each ε > 0, there is ρ(ε) > 0 so that this event has overwhelming probability for large n.

To prove this, we first need a lemma about individual subsets proven in [11].

Lemma 7.5: Fix ε > 0. Choose δ so that conditions (VII.2) and (VII.3) hold. Choose ρ so that the resulting exponents stand in the proper relation, and let β denote the difference between the two sides. For a subset J, let Ωₙ,J denote the event that Φ_J furnishes an ε-isometry to ℓ¹. Then as n → ∞, P(Ωₙ,Jᶜ) ≤ exp(−βn).

Now note that the event of interest for Lemma 7.4 is the intersection of the individual events; to finish, apply the individual Lemma 7.5 together with the combining principle in Lemma 7.3.

C. Quotient Norm Inequalities
We now show that, for ρ sufficiently small, nearly all large n by mₙ matrices have property CS3. Our argument borrows heavily from [11], which the reader may find helpful. We here make no attempt to provide intuition or to compare with closely related notions in the local theory of Banach spaces (e.g., Milman's Quotient of a Subspace Theorem [19]).

Let J be any collection of indices in {1, ..., mₙ}; the range of Φ_J is a linear subspace of Rⁿ, and on this subspace a subset of possible sign patterns can be realized, i.e., sequences of ±1's generated by the signs of the inner products ⟨φᵢ, v⟩ for v in the subspace. CS3 will follow if we can show that for every such sign pattern, some approximation to a small multiple of it can be realized by a vector of controlled norm.
Lemma 7.6 (Uniform Sign-Pattern Embedding): Fix ε > 0, and set the threshold as in (VII.4). For ρ sufficiently small, there is an event whose probability tends to 1 as n → ∞. On this event, for each subset J with |J| at most ρn/log mₙ, and for each sign pattern realizable on the range of Φ_J, there is a vector w with the approximation property (VII.5) and the dual-feasibility property (VII.6).
In words, a small multiple of any realizable sign pattern almost lives in the dual ball determined by the columns of Φ. Before proving this result, we indicate how it gives the property CS3; namely, how the conclusions of Lemma 7.6, together with the spherical sections property, imply the quotient norm inequality (VII.7).

Consider the convex optimization problem (VII.8) of minimizing an ℓ¹ objective subject to linear constraints determined by v. This can be written as a linear program, by the same sort of construction as given for (IV.1). By the duality theorem for linear programming, the value of the primal program is at least the value of the dual. Lemma 7.6 gives us a supply of dual-feasible vectors and hence a lower bound on the dual program. Taking the sign pattern of v, we can find a vector which is dual-feasible and obeys the required inner-product bound; picking the constants appropriately and taking into account the spherical sections theorem, for n sufficiently large, we obtain (VII.7) with an appropriate constant.
1) Proof of Uniform Sign-Pattern Embedding: The proof of Lemma 7.6 follows closely a similar result in [11] that considered a related case. Our idea here is to adapt the argument for that case to the present setting, with changes reflecting the different choice of constants and the sparsity bound |J| ≤ ρn/log mₙ. We leave out large parts of the argument, as they are identical to the corresponding parts in [11]. The bulk of our effort goes to produce the following lemma, which demonstrates approximate embedding of a single sign pattern in the dual ball.
Lemma 7.7 (Individual Sign-Pattern Embedding): Let ε, ρ, and the associated constants be as in the statement of Lemma 7.6, and fix a subset J and a sign pattern on J. Given a collection of columns (φᵢ), there is an iterative algorithm, described below, producing a vector w as output which obeys the dual-feasibility bound (VII.9). Let the φᵢ be i.i.d. uniform on Sⁿ⁻¹; there is an event described below, whose complement has probability controlled by an exponential bound (VII.10), with exponent which can be explicitly given in terms of ε and ρ. On this event, the approximation bound (VII.11) holds.

Lemma 7.7 will be proven in a section of its own. We now show that it implies Lemma 7.6.
We recall a standard implication of so-called Vapnik–Červonenkis theory [46]: the number of sign patterns cut out on a fixed collection of mₙ points by subspaces of small dimension is only polynomially large. Notice that if |J| ≤ ρn/log mₙ, then the dimension of the relevant subspace is small, while also the number of admissible subsets is controlled, as in the proof of Lemma 7.3. Hence, the total number of sign patterns generated by operators Φ_J obeys a bound of the form exp(cρn) for a modest constant c. Now the exponent furnished by Lemma 7.7 is positive, so pick ρ with cρ below it. Define the global event as the intersection, over all pairs of subset and sign pattern, of the instances of the event in the statement of Lemma 7.7 generated by each specific combination. On this global event, every sign pattern associated with any J obeying |J| ≤ ρn/log mₙ is almost dual feasible. Now the failure probability is bounded by the number of pairs times the individual failure probability, which tends to zero exponentially rapidly.

D. Proof of Individual Sign-Pattern Embedding
1) An Embedding Algorithm: The companion paper [11] introduced an algorithm to create a dual feasible point starting from a nearby almost-feasible point. It worked as follows.

Let I₀ be the collection of indices i with |⟨φᵢ, v₀⟩| exceeding the initial threshold, and then set v₁ = v₀ − P_{I₀} v₀, where P_I denotes the least-squares projector onto the span of (φᵢ : i ∈ I). In effect, identify the indices where the current iterate exceeds half the forbidden level, and "kill" those indices. Continue this process, producing v₂, v₃, etc., with stage-dependent thresholds successively closer to the forbidden level, stopping after a bounded number of stages.

If at some stage the offending set is empty, then the process terminates early, and we set w equal to the current iterate. Termination must occur by the final stage. At termination, no inner product exceeds the forbidden level. Hence, w is definitely dual feasible. The only question is how close to v₀ it is.

2) Analysis Framework: Also in [11], bounds were developed for two key descriptors of the algorithm trajectory: the sizes of the successive offending sets and the norms of the successive corrections. We adapt the arguments deployed there. We define candidate bounds for these descriptors, of a geometrically decaying form, with constants to be defined later. We define subevents on which the descriptors obey their candidate bounds at each stage. Now define the event of Lemma 7.7 as the intersection of these subevents; this event implies the approximation bound (VII.11), because the total movement of the iterates is then dominated by a convergent geometric series. We will show that, for constants chosen in conjunction with ρ, each subevent fails with at most exponentially small probability (VII.12). This implies the stated probability bound, and the lemma follows.
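A minimal numerical sketch of the embedding algorithm just described (the starting point, scale, threshold schedule, and sizes are illustrative assumptions; the paper's constants differ):

```python
import numpy as np

def sign_pattern_embed(Phi, J, sigma, scale=0.5, n_stages=10):
    # Iteratively project the candidate v off the span of "offending" columns
    # whose inner products exceed a stage-dependent threshold near 1/2.
    n, m = Phi.shape
    v = scale * (Phi[:, J] @ sigma) / np.sqrt(len(J))   # almost-feasible start
    off_support = np.ones(m, dtype=bool)
    off_support[J] = False
    killed = np.zeros(m, dtype=bool)
    thresholds = 0.5 + 0.25 * 0.5 ** np.arange(n_stages)  # decreasing to 1/2
    for t in thresholds:
        corr = Phi.T @ v
        killed |= off_support & (np.abs(corr) > t)         # indices to "kill"
        if not killed.any():
            continue
        B = Phi[:, killed]                                 # least-squares projection
        v = v - B @ np.linalg.lstsq(B, v, rcond=None)[0]   # off span of killed columns
    return v

rng = np.random.default_rng(5)
n, m, k = 80, 200, 5
Phi = rng.standard_normal((n, m))
Phi /= np.linalg.norm(Phi, axis=0)                   # unit-norm columns
J = np.arange(k)
sigma = np.array([1.0, -1.0, 1.0, 1.0, -1.0])
w = sign_pattern_embed(Phi, J, sigma)
off = np.abs(Phi.T @ w)
off[J] = 0.0
print(off.max() <= 0.5)                              # dual feasibility off J
```

At these benign sizes the starting point is already nearly feasible and few, if any, projections are needed; the lemma's content is that the same conclusion survives at the much more aggressive sparsity levels ρn/log mₙ.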
3) Large Deviations: Define events expressing that each stage's offending set and correction obey their candidate bounds, so that the intersection of these events is the event of Lemma 7.7. Put the overall failure exponent accordingly, and note that this depends quite weakly on n. Recall that the event of Lemma 7.1 is defined in terms of the CS1 constants; on that event, the relevant Gram matrices are well conditioned. Lemma 7.1 implicitly defined a quantity lower bounding the minimum eigenvalue of Φ_Jᵀ Φ_J everywhere in the allowed range of subsets. Pick the stage thresholds so that the projections at each stage are well behaved, and pick the remaining constants so that the required inequalities hold. With this choice of constants, when the event occurs, the corrections contract geometrically. Also, on this event, the offending sets shrink rapidly (say, geometrically) for n large.
In [11], an analysis framework was developed in which a family of i.i.d. Gaussian random variables was introduced, and it was shown that the inner products driving the algorithm can be compared to such variables, both in size and in number. That paper also stated two simple large deviations bounds.

Lemma 7.8: Let Z₁, ..., Zₙ be i.i.d. standard normal random variables. Then the sample mean and the number of exceedances of a positive threshold each obey exponential large deviations bounds.

Applying this, we note that the event controlling a given stage, stated in terms of the algorithm's inner products, is equivalent to an event stated in terms of standard normal random variables, with recentering and rescaling determined by the stage. We therefore have, for each stage, the corresponding exponential inequality. Since the thresholds approach their limit geometrically, the term of most concern is at the first stage; the other terms are always better. Also, the leading exponent in fact does not depend on n. Focusing now on the first stage, we may write the bound explicitly; recalling the choice of constants and putting the pieces together, we get an exponential bound uniform over stages. A similar analysis holds for the sizes of the offending sets.
VIII. CONCLUSION

A. Summary

We have described an abstract framework for compressed sensing of objects which can be represented as vectors x ∈ Rᵐ. We assume the object x of interest is a priori compressible, so that θ = Ψᵀx has controlled ℓᵖ norm for a known basis or frame Ψ and 0 < p ≤ 1. Starting from an n by m matrix Φ satisfying conditions CS1–CS3, and with Ψ the matrix of an orthonormal basis or tight frame underlying x, we define the information operator y = ΦΨᵀx. Starting from the n-tuple of measured information y, we reconstruct an approximation to x by solving

min ‖Ψᵀx‖₁ subject to y = ΦΨᵀx.

The proposed reconstruction rule uses convex optimization and is computationally tractable. Also, the needed matrices Φ satisfying CS1–CS3 can be constructed by random sampling, drawing the columns i.i.d. from the uniform distribution on the sphere.
We give error bounds showing that despite the apparent undersampling (n ≪ m), good accuracy reconstruction is possible for compressible objects, and we explain the near-optimality of these bounds using ideas from Optimal Recovery and Information-Based Complexity. We even show that the results are stable under small measurement errors in the data. Potential applications are sketched related to imaging and spectroscopy.

B. Alternative Formulation
We remark that the CS1–CS3 conditions are not the only way to obtain our results. Our proof of Theorem 9 really shows the following.

Theorem 11: Suppose that an n × m matrix Φ obeys the following conditions, with positive constants.

A1: The maximal concentration (defined in Section IV-B) obeys the smallness condition (VIII.1).

A2: The width (defined in Section II) obeys the decay condition (VIII.2).

Let 0 < p ≤ 1. For some C > 0 and all θ in the unit ℓᵖ ball, the solution of (P₁) obeys the error estimate of Theorem 9.

In short, a different approach might exhibit operators with good widths over ℓ¹ balls only, and low concentration on "thin" sets. Another way to see that the conditions CS1–CS3 can no doubt be approached differently is to compare the results in [11], [35]; the second paper proves results which partially overlap those in the first, by using a different technique.

C. The Partial Fourier Ensemble
We briefly discuss two recent articles which do not fit in the n-widths tradition followed here, and so were not easy to cite earlier with due prominence.

First, and closest to our viewpoint, is the breakthrough paper of Candès, Romberg, and Tao [13]. This was discussed in Section IV-B; the result of [13] showed that ℓ¹ minimization can be used to exactly recover sparse sequences from the Fourier transform at n randomly chosen frequencies, whenever the sequence has fewer than a constant multiple of n/log(n) nonzeros.

Second is the article of Gilbert et al. [12], which showed that a different nonlinear reconstruction algorithm can be used to recover approximations to a vector which are nearly as good as the best N-term approximation in ℓ² norm, using about N times a polylogarithmic factor of random but nonuniform samples in the frequency domain; here one factor in the polylogarithmic term is (it seems) an upper bound on the norm of the vector.

These papers both point to the partial Fourier ensemble, i.e., the collection of n × m matrices made by sampling n rows out of the m × m Fourier matrix, as concrete examples of matrices working within the CS framework; that is, generating near-optimal subspaces for Gel'fand n-widths, and allowing ℓ¹ minimization to reconstruct from such information for all 0 < p ≤ 1. Now [13] (in effect) proves that in the partial Fourier ensemble with uniform measure, the maximal concentration condition A1 (VIII.1) holds with overwhelming probability for large n (for appropriate constants). On the other hand, the results in [12] seem to show that condition A2 (VIII.2) holds in the partial Fourier ensemble with overwhelming probability for large n, when it is sampled with a certain nonuniform probability measure. Although the two papers [13], [12] refer to different random ensembles of partial Fourier matrices, both reinforce the idea that interesting, relatively concrete families of operators can be developed for compressed sensing applications. In fact, Candès has informed us of some recent results he obtained with Tao [47] indicating that, modulo polylog factors, A2 holds for the uniformly sampled partial Fourier ensemble. This seems a very significant advance.
Note Added in Proof

In the months since the paper was written, several groups have conducted numerical experiments on synthetic and real data for the method described here and related methods. They have explored applicability to important sensor problems, and studied applications issues such as stability in the presence of noise. The reader may wish to consult the forthcoming Special Issue on Sparse Representation of the EURASIP journal Applied Signal Processing, or look for papers presented at a special session at ICASSP 2005, or the workshop on sparse representation held in May 2005 at the University of Maryland Center for Scientific Computing and Applied Mathematics, or the workshop in November 2005 at SPARS05, Université de Rennes.

A referee has pointed out that Compressed Sensing is in some respects similar to problems arising in data stream processing, where one wants to learn basic properties (e.g., moments, histograms) of a data stream without storing the stream. In short, one wants to make relatively few measurements while inferring relatively much detail. The notions of "Iceberg queries" in large databases and "heavy hitters" in data streams may provide points of entry into that literature.
ACKNOWLEDGMENT

In spring 2004, Emmanuel Candès told the present author about his ideas for using the partial Fourier ensemble in "undersampled imaging"; some of these were published in [13]; see also the presentation [14]. More recently, Candès informed the present author of the results in [47] referred to above. It is a pleasure to acknowledge the inspiring nature of these conversations. The author would also like to thank Anna Gilbert for telling him about her work [12] finding the B best Fourier coefficients by nonadaptive sampling, and to thank Emmanuel Candès for conversations clarifying Gilbert's work. Thanks to the referees for numerous suggestions which helped to clarify the exposition and argumentation. Anna Gilbert offered helpful pointers to the data stream processing literature.
REFERENCES

[1] D. L. Donoho, M. Vetterli, R. A. DeVore, and I. C. Daubechies, "Data compression and harmonic analysis," IEEE Trans. Inf. Theory, vol. 44, no. 6, pp. 2435–2476, Oct. 1998.
[2] Y. Meyer, Wavelets and Operators. Cambridge, U.K.: Cambridge Univ. Press, 1993.
[3] D. L. Donoho, "Unconditional bases are optimal bases for data compression and for statistical estimation," Appl. Comput. Harmonic Anal., vol. 1, pp. 100–115, 1993.
[4] D. L. Donoho, "Sparse components of images and optimal atomic decomposition," Constructive Approx., vol. 17, pp. 353–382, 2001.
[5] A. Pinkus, "n-widths and optimal recovery in approximation theory," in Proc. Symp. Applied Mathematics, vol. 36, C. de Boor, Ed., Providence, RI, 1986, pp. 51–66.
[6] J. F. Traub and H. Woźniakowski, A General Theory of Optimal Algorithms. New York: Academic, 1980.
[7] D. L. Donoho, "Unconditional bases and bit-level compression," Appl. Comput. Harmonic Anal., vol. 3, pp. 388–392, 1996.
[8] A. Pinkus, n-Widths in Approximation Theory. New York: Springer-Verlag, 1985.
[9] A. Y. Garnaev and E. D. Gluskin, "On widths of the Euclidean ball" (in English), Sov. Math.–Dokl., vol. 30, pp. 200–203, 1984.
[10] B. S. Kashin, "Diameters of certain finite-dimensional sets in classes of smooth functions," Izv. Akad. Nauk SSSR, Ser. Mat., vol. 41, no. 2, pp. 334–351, 1977.
[11] D. L. Donoho, "For most large underdetermined systems of linear equations, the minimal ℓ¹-norm solution is also the sparsest solution," Commun. Pure Appl. Math., to be published.
[12] A. C. Gilbert, S. Guha, P. Indyk, S. Muthukrishnan, and M. Strauss, "Near-optimal sparse Fourier representations via sampling," in Proc. 34th ACM Symp. Theory of Computing, Montréal, QC, Canada, May 2002, pp. 152–161.
[13] E. J. Candès, J. Romberg, and T. Tao, "Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information," IEEE Trans. Inf. Theory, to be published.
[14] E. J. Candès, "Robust Uncertainty Principles and Signal Recovery," presented at the 2nd Int. Conf. Computational Harmonic Analysis, Nashville, TN, May 2004.
[15] C. A. Micchelli and T. J. Rivlin, "A survey of optimal recovery," in Optimal Estimation in Approximation Theory, C. A. Micchelli and T. J. Rivlin, Eds. Plenum, 1977, pp. 1–54.
[16] J. Peetre and G. Sparr, "Interpolation of normed abelian groups," Ann. Math. Pure Appl., ser. 4, vol. 92, pp. 217–262, 1972.
[17] J. Bergh and J. Löfström, Interpolation Spaces. An Introduction. Berlin, Germany: Springer-Verlag, 1976.
[18] B. Carl, "Entropy numbers, s-numbers, and eigenvalue problems," J. Funct. Anal., vol. 41, pp. 290–306, 1981.
[19] G. Pisier, The Volume of Convex Bodies and Banach Space Geometry. Cambridge, U.K.: Cambridge Univ. Press, 1989.
[20] C. Schütt, "Entropy numbers of diagonal operators between symmetric Banach spaces," J. Approx. Theory, vol. 40, pp. 121–128, 1984.
[21] T. Kühn, "A lower estimate for entropy numbers," J. Approx. Theory, vol. 110, pp. 120–124, 2001.
[22] S. Gal and C. Micchelli, "Optimal sequential and nonsequential procedures for evaluating a functional," Appl. Anal., vol. 10, pp. 105–120, 1980.
[23] A. A. Melkman and C. A. Micchelli, "Optimal estimation of linear operators from inaccurate data," SIAM J. Numer. Anal., vol. 16, pp. 87–105, 1979.
[24] M. A. Kon and E. Novak, "The adaption problem for approximating linear operators," Bull. Amer. Math. Soc., vol. 23, pp. 159–165, 1990.
[25] E. Novak, "On the power of adaption," J. Complexity, vol. 12, pp. 199–237, 1996.
[26] S. Chen, D. L. Donoho, and M. A. Saunders, "Atomic decomposition by basis pursuit," SIAM J. Sci. Comput., vol. 20, no. 1, pp. 33–61, 1999.
[27] D. L. Donoho and X. Huo, "Uncertainty principles and ideal atomic decomposition," IEEE Trans. Inf. Theory, vol. 47, no. 7, pp. 2845–2862, Nov. 2001.
[28] M. Elad and A. M. Bruckstein, "A generalized uncertainty principle and sparse representations in pairs of bases," IEEE Trans. Inf. Theory, vol. 48, no. 9, pp. 2558–2567, Sep. 2002.
[29] D. L. Donoho and M. Elad, "Optimally sparse representation from overcomplete dictionaries via ℓ¹ norm minimization," Proc. Natl. Acad. Sci. USA, vol. 100, no. 5, pp. 2197–2202, Mar. 2003.
[30] R. Gribonval and M. Nielsen, "Sparse representations in unions of bases," IEEE Trans. Inf. Theory, vol. 49, no. 12, pp. 3320–3325, Dec. 2003.
[31] J. J. Fuchs, "On sparse representation in arbitrary redundant bases," IEEE Trans. Inf. Theory, vol. 50, no. 6, pp. 1341–1344, Jun. 2004.
[32] J. A. Tropp, "Greed is good: Algorithmic results for sparse approximation," IEEE Trans. Inf. Theory, vol. 50, no. 10, pp. 2231–2242, Oct. 2004.
[33] J. A. Tropp, "Just relax: Convex programming methods for identifying sparse signals in noise," IEEE Trans. Inf. Theory, vol. 52, no. 3, pp. 1030–1051, Mar. 2006.
[34] D. L. Donoho, M. Elad, and V. Temlyakov, "Stable recovery of sparse overcomplete representations in the presence of noise," IEEE Trans. Inf. Theory, vol. 52, no. 1, pp. 6–18, Jan. 2006.
[35] D. L. Donoho, "For most underdetermined systems of linear equations, the minimal ℓ¹-norm near-solution approximates the sparsest near-solution," Commun. Pure Appl. Math., to be published.
[36] I. C. Daubechies, Ten Lectures on Wavelets. Philadelphia, PA: SIAM, 1992.
[37] S. Mallat, A Wavelet Tour of Signal Processing. San Diego, CA: Academic, 1998.
[38] A. Cohen, R. DeVore, P. Petrushev, and H. Xu, "Nonlinear approximation and the space BV(R²)," Amer. J. Math., vol. 121, pp. 587–628, 1999.
[39] E. J. Candès and D. L. Donoho, "Curvelets: A surprisingly effective nonadaptive representation for objects with edges," in Curves and Surfaces, C. Rabut, A. Cohen, and L. L. Schumaker, Eds. Nashville, TN: Vanderbilt Univ. Press, 2000.
[40] E. J. Candès and D. L. Donoho, "New tight frames of curvelets and optimal representations of objects with piecewise C² singularities," Commun. Pure Appl. Math., vol. LVII, pp. 219–266, 2004.
[41] S. J. Szarek, "Spaces with large distance to ℓ∞ⁿ and random matrices," Amer. J. Math., vol. 112, pp. 819–842, 1990.
[42] S. J. Szarek, "Condition numbers of random matrices," J. Complexity, vol. 7, pp. 131–149, 1991.
[43] A. Dvoretsky, "Some results on convex bodies and Banach spaces," in Proc. Symp. Linear Spaces, Jerusalem, Israel, 1961, pp. 123–160.
[44] T. Figiel, J. Lindenstrauss, and V. D. Milman, "The dimension of almost-spherical sections of convex bodies," Acta Math., vol. 139, pp. 53–94, 1977.
[45] V. D. Milman and G. Schechtman, Asymptotic Theory of Finite-Dimensional Normed Spaces (Lecture Notes in Mathematics). Berlin, Germany: Springer-Verlag, 1986, vol. 1200.
[46] D. Pollard, Empirical Processes: Theory and Applications. Hayward, CA: Inst. Math. Statist., vol. 2, NSF-CBMS Regional Conference Series in Probability and Statistics.
[47] E. J. Candès and T. Tao, "Near-optimal signal recovery from random projections: Universal encoding strategies," Applied and Computational Mathematics, Calif. Inst. Technol., Tech. Rep., 2004.