Asset discovery is an event listener that sits on every page and waits for visitors to click links that are considered content assets.
All behavior on web pages is recorded in the 'session', and this event listener continuously checks the session for new events.
Once it identifies an event, it checks whether the event is a click on a content asset.
By default, assets are defined as any external pages or files (PDF, PPT, PPTX, MP4, OGG, WEBM, YouTube, and so on). If you configure URL patterns, the engine will look for links that match them as well.
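
The check described above can be sketched roughly as follows. This is only an illustration, not the product's actual code: the function name is_content_asset, the glob-style pattern syntax, the rule that an exclude pattern wins over an include pattern, and the reading of "external" as "hosted on a different domain than the current page" are all assumptions made for the example.

```python
from fnmatch import fnmatchcase
from posixpath import splitext
from urllib.parse import urlparse

# File types and video hosts taken from the default list in the text.
ASSET_EXTENSIONS = {".pdf", ".ppt", ".pptx", ".mp4", ".ogg", ".webm"}
VIDEO_HOSTS = {"youtube.com", "www.youtube.com", "youtu.be"}


def is_content_asset(link_url, page_url, include_patterns=(), exclude_patterns=()):
    """Decide whether a clicked link counts as a content asset (hypothetical helper)."""
    # Assumption: exclude patterns take precedence over everything else.
    if any(fnmatchcase(link_url, pattern) for pattern in exclude_patterns):
        return False
    # Assumption: a configured include pattern is enough on its own.
    if any(fnmatchcase(link_url, pattern) for pattern in include_patterns):
        return True

    link = urlparse(link_url)
    page = urlparse(page_url)
    link_host = (link.hostname or "").lower()
    page_host = (page.hostname or "").lower()

    # Default rule 1: known video hosts.
    if link_host in VIDEO_HOSTS:
        return True
    # Default rule 2: links to files of a known asset type.
    if splitext(link.path)[1].lower() in ASSET_EXTENSIONS:
        return True
    # Default rule 3: pages on another domain (relative links have no host).
    return bool(link_host) and link_host != page_host
```

For example, with page_url set to "https://example.com/home", a link to "https://example.com/files/guide.pdf" is treated as an asset because of its file type, while "https://example.com/about" is not unless an include pattern such as "https://example.com/about*" is configured.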
A few examples of how to configure the exclude/include rules for Asset URL Patterns:
Page to include/exclude