---
title: Initial Document Import
description: Use the import API for large document imports
---

## Foreword and import statistics

Importing millions of documents from legacy systems into Livingdocs takes time. We observed these numbers:

- 50k articles per hour
- 100k - 300k images per hour

During these observations, memory usage was around 4 GB of RAM, and about 25 Mbps of inbound and outbound bandwidth was used. If no images are imported, considerably more documents can be imported in the same time.

## Custom document IDs

During a migration of an existing system, it is best practice to migrate all entries of the old system into Livingdocs. To ease the migration, Livingdocs supports user-defined identifiers, so a custom import script can reuse existing identifiers. To prevent issues with the id generation of Postgres, the maximum allowed custom id is configurable.

## Example

SQL to execute to prevent conflicts when new documents are generated. Replace 100000 with a value above the maximum id of the legacy system you'd like to import documents from.

```sql
ALTER SEQUENCE documents_id_seq RESTART WITH 100000;
```

Livingdocs Server Configuration needed to support custom ids:

```js
// server configuration
documents: {
  allowCustomIdsBelow: 100000,
}
```

Example curl request to import a document with a custom document id:

```bash
# Attention! The systemName must always be the same for all documents,
# otherwise the mapping does not work properly.
curl -k -X POST "https://server.livingdocs.io/api/2026-05/import/documents" \
  -H "Authorization: Bearer ey1234" \
  -H "Content-Type: application/json; charset=utf-8" \
  --data-binary @- << EOF
{
  "systemName": "import",
  "documents": [{
    "documentId": 1,
    "id": "123abc",
    "title": "test import",
    "contentType": "article",
    "checksum": "xyz456",
    "livingdoc": {
      "content": [],
      "design": {
        "name": "living-times",
        "version": "1.0.1"
      }
    },
    "metadata": {
      "description": "foo"
    }
  }]
}
EOF
```

## Custom publication dates

When importing articles from legacy systems, you should set the `publicationDate`. The `publicationDate` field is documented in the [Public API](/reference/public-api/llms.txt) and the [Import API reference documentation](/customising/advanced/import-api/llms.txt). It controls when an article counts as published or updated and is important for the search to function properly.

If an article has multiple publication dates and you want to keep a history of, for example, `created` and `updated`, we advise importing the same article twice. First import the article with the `publicationDate` set to the first time the article was published. Then re-import the article with a new `publicationDate`, which effectively updates it. We save the `firstPublicationDate` of an article, so you can access both dates later in your delivery and show when an article was initially published and when it was updated.
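
The two-pass import for publication dates can be sketched in JavaScript as follows. This is a minimal sketch under stated assumptions, not the Livingdocs implementation: the helper `buildPublicationDateImports`, the `created` and `updated` fields on the legacy record, and the checksum suffixes are hypothetical. It also assumes that `publicationDate` is a top-level field of each document entry and that a changed checksum is needed for the re-import to be treated as an update. Verify both against the Import API reference.

```js
// Hypothetical helper: builds the two import payloads for one legacy article.
// Assumption: `publicationDate` sits at the top level of each document entry.
function buildPublicationDateImports (systemName, legacyArticle) {
  const base = {
    id: legacyArticle.id,
    title: legacyArticle.title,
    contentType: legacyArticle.contentType,
    livingdoc: legacyArticle.livingdoc,
    metadata: legacyArticle.metadata
  }

  const toPayload = (publicationDate, checksumSuffix) => ({
    // Same systemName for every payload, so the id mapping stays intact.
    systemName,
    documents: [{
      ...base,
      // Assumption: a different checksum marks the second pass as a change.
      checksum: `${legacyArticle.checksum}-${checksumSuffix}`,
      publicationDate: new Date(publicationDate).toISOString()
    }]
  })

  // Order matters: the first publication goes first, the update second.
  return [
    toPayload(legacyArticle.created, 'created'),
    toPayload(legacyArticle.updated, 'updated')
  ]
}

// Illustrative legacy record; all values are made up for this sketch.
const [firstImport, secondImport] = buildPublicationDateImports('import', {
  id: '123abc',
  title: 'test import',
  contentType: 'article',
  checksum: 'xyz456',
  created: '2019-03-01T08:00:00Z',
  updated: '2021-07-15T12:30:00Z',
  livingdoc: {
    content: [],
    design: { name: 'living-times', version: '1.0.1' }
  },
  metadata: { description: 'foo' }
})

console.log(JSON.stringify(firstImport, null, 2))
console.log(JSON.stringify(secondImport, null, 2))
```

Each payload would be sent as its own POST request to the import endpoint, in order, with the second request sent only after the first import has completed.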