Title Dinamička modalna dekompozicija u analizi video zapisa
Title (english) Dynamic mode decomposition in video analysis
Author Luka Karlić
Mentor Zlatko Drmač (mentor)
Committee member Zvonimir Bujanović (committee chair)
Committee member Tina Bosner (committee member)
Committee member Mladen Vuković (committee member)
Committee member Boris Širola (committee member)
Granter University of Zagreb Faculty of Science (Department of Mathematics) Zagreb
Defense date and country 2024-07-16, Croatia
Scientific / art field, discipline and subdiscipline NATURAL SCIENCES Mathematics
Abstract U ovom radu koristimo dinamičku modalnu dekompoziciju (DMD) kako bismo na efikasan način u stvarnom vremenu odvojili prvi plan od pozadine u videozapisu. Svaku sličicu u videozapisu odvajamo na komponentu niskog ranga (eng. low-rank component) te rijetki dio (eng. sparse). Ova metoda inicijalno je razvijena za potrebe analiziranja nelinearnih dinamičkih sustava, no s obzirom da je pogonjena isključivo podacima (eng. data driven), pronašla je svoje primjene u raznim područjima, u ovom slučaju u računalnom vidu. DMD raščlanjuje originalni dinamički sustav na modove niskog ranga s poznatim Fourierovim komponentama u vremenu. Time dobivamo dekompoziciju koja uzima u obzir i vremensku i prostornu ovisnost. Strogo gledano, pozadinu čine DMD modovi s Fourierovim koeficijentima blizu ishodištu, dok je prvi plan sastavljen od nešto udaljenijih modova. Za razliku od RPCA, koji je vodeća metoda za separaciju, DMD-ova najzahtjevnija operacija jest samo jedna SVD dekompozicija, čime omogućuje procesuiranje u stvarnom vremenu. Nadalje, opisujemo i poboljšanje standardne DMD metode. Sažeti DMD iskorištava prednosti teorije sažetog uzorkovanja (eng. compressed sensing) i skiciranja matrice (eng. matrix sketching) da bi znatno efikasnije aproksimirao DMD modove iz sažete reprezentacije početnog dinamičkog sustava, dok je preciznost skoro nepromijenjena. Videozapisi prirodno uvode problem velike dimenzionalnosti zbog jako velikog broja piksela, no sažetom reprezentacijom izbjegavamo računati SVD na vrlo velikoj matrici, a k tomu smo i u mogućnosti koristiti videozapise znatno veće rezolucije, čime očuvamo detalje. Metodu u konačnici evaluiramo na mnoštvu javno dostupnih skupova podataka te navodimo brojčane rezultate. Kako bismo mjerili kakvoću, koristimo matricu zabune i četiri različite metrike - preciznost, odziv, F1 mjeru i Matthewseov koeficijent korelacije. 
Pokazali smo da vrlo uspješno može odvojiti pozadinu i prvi plan, čime otvara mnoge mogućnosti u računalnom vidu s primjenama na videonadzor.
Abstract (english) In this thesis, we use dynamic mode decomposition (DMD) to separate the foreground from the background of a video efficiently and in real time. Each frame is split into a low-rank component and a sparse component. DMD was originally developed for the analysis of nonlinear dynamical systems, but because it is purely data-driven, it has found applications in many fields, here in computer vision. DMD decomposes the original dynamical system into low-rank modes with known Fourier components in time, which yields a decomposition that accounts for both temporal and spatial dependence. Strictly speaking, the background consists of the DMD modes whose Fourier coefficients lie close to the origin, while the foreground is composed of the modes lying somewhat farther away. Unlike RPCA, the leading method for this separation, DMD's most expensive operation is a single SVD, which makes real-time processing possible. We also describe an improvement of the standard DMD method: compressed DMD uses compressed sensing and matrix sketching to approximate the DMD modes far more efficiently from a compressed representation of the original dynamical system, with almost no loss of accuracy. Videos are naturally high-dimensional because of their very large number of pixels; with the compressed representation we avoid computing the SVD of a very large matrix and can also process videos of considerably higher resolution, thereby preserving detail. Finally, we evaluate the method on a number of publicly available datasets and report numerical results. To measure quality, we use the confusion matrix and four metrics: precision, recall, F1 score, and the Matthews correlation coefficient. We show that the method separates background and foreground very successfully, opening up many possibilities in computer vision, with applications in video surveillance.
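The separation summarized in the abstract can be illustrated with a minimal sketch. It is not the thesis code: the function name `dmd_background`, the parameters `rank`, `dt`, and `tol`, and the synthetic test data are all assumptions made for illustration. It uses plain (uncompressed) DMD with a single truncated SVD, keeps the modes whose continuous-time frequency is close to zero as the low-rank background, and treats the remainder as the sparse foreground. Taking the real part of the reconstruction is a simplification.

```python
import numpy as np

def dmd_background(X, rank, dt=1.0, tol=1e-2):
    """Split a frame matrix X (pixels x frames) into low-rank and sparse parts.

    Hypothetical helper for illustration; names and defaults are assumptions.
    """
    X1, X2 = X[:, :-1], X[:, 1:]                  # consecutive snapshot pairs
    U, s, Vh = np.linalg.svd(X1, full_matrices=False)  # the single SVD
    U, s, Vh = U[:, :rank], s[:rank], Vh[:rank]   # rank truncation
    B = X2 @ Vh.conj().T / s                      # X2 V S^{-1}
    A_tilde = U.conj().T @ B                      # reduced linear operator
    lam, W = np.linalg.eig(A_tilde)               # DMD eigenvalues
    Phi = B @ W                                   # DMD modes
    omega = np.log(lam.astype(complex)) / dt      # continuous-time frequencies
    # Mode amplitudes from a least-squares fit to the first frame.
    b = np.linalg.lstsq(Phi, X[:, 0].astype(complex), rcond=None)[0]
    bg = np.abs(omega) < tol                      # modes near the origin
    t = np.arange(X.shape[1]) * dt
    L = (Phi[:, bg] * b[bg]) @ np.exp(np.outer(omega[bg], t))
    L = L.real                                    # simplification: drop imaginary part
    return L, X - L                               # low-rank part, sparse part

# Synthetic toy data (an assumption, not a real video): a static background
# plus a component oscillating in time, so the dynamics are exactly linear.
rng = np.random.default_rng(0)
n_pixels, n_frames = 64, 40
bg_true = rng.uniform(0.5, 1.0, n_pixels)
a = 0.2 * rng.normal(size=n_pixels)
c = 0.2 * rng.normal(size=n_pixels)
k = np.arange(n_frames)
X = bg_true[:, None] + np.outer(a, np.cos(0.5 * k)) + np.outer(c, np.sin(0.5 * k))

L, S = dmd_background(X, rank=3)
```

On real video, each column of `X` would be one flattened grayscale frame, and both `rank` and `tol` would need tuning. The compressed variant described in the abstract would apply the same steps to a sketched version of `X`.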
Keywords
dinamička modalna dekompozicija
DMD
videozapisi
Fourier
Keywords (english)
dynamic mode decomposition
DMD
video
Fourier
Language Croatian
URN:NBN urn:nbn:hr:217:680445
Study programme Title: Mathematical Statistics Study programme type: university Study level: graduate Academic / professional title: sveučilišni magistar matematike
Type of resource Text
File origin Born digital
Access conditions Open access
Repository Repository of the Faculty of Science
Created on 2024-10-02 11:05:35