During retrieval, the ParentDocumentRetriever first fetches the small chunks, then looks up the parent IDs for those chunks and returns the larger parent documents.
You may want to have small documents, so that their embeddings most accurately reflect their meaning. If a document is too long, its embedding can lose meaning.
At the same time, you want documents long enough that the context of each chunk is retained.
ParentDocumentRetriever strikes that balance by splitting and storing small chunks of data.
Note that "parent document" refers to the document that a small chunk originated from. This can either be the whole raw document OR a larger chunk.
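The small-chunk-to-parent lookup can be sketched in plain Python. This is a toy illustration of the idea and not LangChain's implementation: the class `ToyParentRetriever`, the helpers `split_text`, `embed`, and `cosine`, and the word-count "embedding" are all hypothetical stand-ins for a real text splitter, embedding model, vector store, and document store.

```python
import re
from collections import Counter
from math import sqrt


def split_text(text: str, size: int) -> list[str]:
    """Split text into consecutive pieces of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


class ToyParentRetriever:
    """Hypothetical sketch: index small chunks, return their parents."""

    def __init__(self, child_size: int = 8):
        self.child_size = child_size
        self.docstore: dict[int, str] = {}           # parent id -> full parent text
        self.index: list[tuple[Counter, int]] = []   # (chunk embedding, parent id)

    def add_documents(self, parents: list[str]) -> None:
        for text in parents:
            parent_id = len(self.docstore)
            self.docstore[parent_id] = text
            # Only the small chunks are embedded; each remembers its parent id.
            for chunk in split_text(text, self.child_size):
                self.index.append((embed(chunk), parent_id))

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        q = embed(query)
        # Step 1: rank the small chunks by similarity to the query.
        ranked = sorted(self.index, key=lambda item: cosine(q, item[0]), reverse=True)
        # Step 2: map the top-k chunks to parent ids, dropping duplicates.
        parent_ids: list[int] = []
        for _, pid in ranked[:k]:
            if pid not in parent_ids:
                parent_ids.append(pid)
        # Step 3: return the larger parent documents, not the chunks.
        return [self.docstore[pid] for pid in parent_ids]


docs = [
    "Cats are small domestic animals. They sleep for most of the day and hunt mice at night.",
    "Python is a programming language. It is widely used for data analysis and web services.",
]
retriever = ToyParentRetriever(child_size=6)
retriever.add_documents(docs)
results = retriever.retrieve("which animals hunt mice", k=1)
print(results)
```

The query matches only a six-word chunk, but the caller receives the full parent text around it. In this sketch the parent is always the whole raw document; using a larger chunk as the parent would mean splitting twice and storing the intermediate pieces in the document store instead.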