Resources Matching and Sync

Conceptually, the main purpose of oboku is to sync files from various locations into a single place and let you read them.

For simplicity, and because files can be of various types (mobby-dick.epub, 2026-bill.pdf, /naruto-season-1), we are going to use the terms file and book. A file is the actual resource on your drive, Kobo store, local file system, etc. A book is the entity managed by oboku.

To sync files into books, one of the challenges is knowing what is what. For example, what happens if the user moves one of their files to a different location? What happens if the user renames a file? What happens if the user has two different files that are the same book?

Unique Identifier

Applications usually solve this problem by using whatever common unique identifier exists. For movies and series you can use the IMDb ID, for books you can use the ISBN, etc. The idea is to identify a file with something unique.

If the user moves a movie file to a different location, renames it, or replaces it with a higher-quality version, the application can re-attach the file to the correct entity. As long as the file is detected as Harry Potter, it can be re-attached with the correct metadata, watch progress, etc.

The problem with ISBN & oboku

Inconsistency & Unreliability

Retrieving an ISBN from a file is not an easy task. Sometimes the ISBN is wrong, sometimes there is no ISBN (my-bill-2026.pdf), and sometimes you even get a different ISBN than the one you scanned the previous day. Unlike movies or series, one book title can have dozens of different ISBNs: physical, digital, language, revisions, etc. Unless you have a perfect EPUB with the ISBN written in it, the ISBN is highly unreliable.

Strong reliance on file content & structure

Reading progress, bookmarks and much of the information about your books are intrinsically tied to the actual file. Most of this metadata is based on EPUB CFI (https://idpf.org/epub/linking/cfi/) and will not work if the file content changes. Sometimes it's possible to migrate or self-repair CFIs when the file changes, but that is an entire topic on its own and very much a "might work" system.

If you change a file and we re-attach it to an existing book, there is a chance much of the existing data will not work as expected. This is fine if you lose the progress of your movie. It is more problematic if you lose your page and all your bookmarks in a 600-page book.
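To illustrate why position-based references are fragile, here is a greatly simplified sketch. It is not a CFI parser and not oboku's code: the `ElementNode` type and the `resolve` function are hypothetical, and the only thing borrowed from EPUB CFI is the convention that element children are addressed with even numbers (/2 is the first element, /4 the second).

```typescript
// Hypothetical, simplified document tree (real EPUB content is XHTML).
type ElementNode = { text: string; children: ElementNode[] };

// Follow a list of CFI-like element steps from the root.
// Element steps are even: /2 -> index 0, /4 -> index 1, and so on.
function resolve(root: ElementNode, steps: number[]): ElementNode | undefined {
  let node: ElementNode | undefined = root;
  for (const step of steps) {
    if (!node) return undefined;
    node = node.children[step / 2 - 1];
  }
  return node;
}

const paragraph = (text: string): ElementNode => ({ text, children: [] });

// The file as it was when the bookmark was saved.
const originalChapter: ElementNode = {
  text: "",
  children: [paragraph("Call me Ishmael."), paragraph("Some years ago...")],
};

// A replaced file where a paragraph was added at the front.
const revisedChapter: ElementNode = {
  text: "",
  children: [paragraph("Preface to the second edition."), ...originalChapter.children],
};

// A bookmark pointing at the second paragraph of the chapter.
const bookmarkSteps = [4];

const before = resolve(originalChapter, bookmarkSteps)?.text;
const after = resolve(revisedChapter, bookmarkSteps)?.text;
```

The same steps resolve to a different paragraph once the content shifts, which is why re-attaching a changed file to an existing book can silently misplace progress and bookmarks.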

We do use ISBN for metadata such as cover, rating, description, etc. This is a different topic.

Simply no ISBN to be found

Many of our users read fan books or organize their books under collection names that don't match anything publicly known. If an ISBN is available only half of the time and we have to use something else as a fallback, why bother? Why not use a different unique identifier altogether?

What we use

Identifier based on resource and provider

Instead of an ISBN, the ID we use is built following rules specific to each provider.

A book synced from Google Drive will use its Google ID, which is globally unique.
A book synced from Synology Drive will use its Synology ID + host.
A book synced from the filesystem will use the file hash + a filesystem flag.
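The rules above can be sketched as a single function. This is a minimal illustration under assumptions, not oboku's actual implementation: the `FileRef` type, the field names, the `buildResourceId` function and the string format of the resulting ID are all hypothetical, and the file hash is assumed to be computed elsewhere.

```typescript
// Hypothetical description of a file as seen by each provider.
type FileRef =
  | { provider: "google-drive"; fileId: string }
  | { provider: "synology-drive"; host: string; fileId: string }
  | { provider: "filesystem"; contentHash: string };

// Build a provider-scoped identifier (illustrative format only).
function buildResourceId(file: FileRef): string {
  switch (file.provider) {
    case "google-drive":
      // The Google ID is already globally unique.
      return `google-drive:${file.fileId}`;
    case "synology-drive":
      // The Synology ID is only unique per server, so the host is included.
      return `synology-drive:${file.host}:${file.fileId}`;
    case "filesystem":
      // The hash depends on content, not on path or name;
      // the prefix plays the role of the filesystem flag.
      return `filesystem:${file.contentHash}`;
  }
}
```

With this shape, the same Synology ID on two different hosts yields two different identifiers, while a local file keeps its identifier when it is moved or renamed because its hash does not change.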

Thanks to that system we are still able to match files and books to some extent. If you move a file around on your filesystem, it will work. If you move it around in your Google Drive, it will work as well.

Drawback

Because we are using a stricter ID, we are unable to automatically match a file moved from Dropbox to Google Drive as the same book. We are also obviously not able to match a different file that is theoretically the same book. But it's better to not do something than to do it wrong.

Let the user decide when we don't know

If we cannot match a file to a book with certainty, we create a new book. If the user wants to merge it with an existing book, they can do so manually through the app. This is not a common use case, and when it happens, nothing is lost.
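The "match or create" behaviour described above could look roughly like the following. It is a sketch under assumptions, not the real sync code: the `Book` type, the in-memory `Map` and the `matchOrCreate` function are hypothetical, and the resource ID is assumed to be the provider-based identifier described earlier.

```typescript
// Hypothetical book entity, indexed by its provider-based resource ID.
type Book = { id: string; resourceId: string; title: string };

function matchOrCreate(
  books: Map<string, Book>,
  resourceId: string,
  title: string,
): { book: Book; created: boolean } {
  const existing = books.get(resourceId);
  if (existing) {
    // Certain match: same resource, even if it was renamed or moved.
    return { book: existing, created: false };
  }
  // No certain match: never guess, create a new book instead.
  const book: Book = { id: `book-${books.size + 1}`, resourceId, title };
  books.set(resourceId, book);
  return { book, created: true };
}

const library = new Map<string, Book>();
const firstSync = matchOrCreate(library, "google-drive:abc", "mobby-dick.epub");
const afterRename = matchOrCreate(library, "google-drive:abc", "Moby Dick.epub");
const otherCopy = matchOrCreate(library, "filesystem:deadbeef", "mobby-dick.epub");
```

Here the renamed file resolves to the existing book, while the copy coming from another provider becomes a new book that the user can merge manually if they wish.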


