Quick Start

In this tutorial you will build a complete pipeline that takes an EPUB file, extracts its metadata, and enriches it with AI-generated content. By the end, you will have a working pipeline you can run on any book file.

1. Sign Up and Create an Organization

Open Zedoc and create an account using your email address. Once you are signed in, you will be prompted to create your first organization. Give it a name — something like your company name or team name works well.

You should see your new organization’s dashboard, with an empty list of pipelines.

2. Create a New Pipeline

Click the New Pipeline button. Give your pipeline a name — for this tutorial, call it “Metadata Extraction.”

You should see the pipeline editor: a blank canvas with a task palette on the side.

3. Add the “Extract File Metadata” Task

Browse the task palette and find Extract File Metadata. Click it to add it to the canvas.

This task reads embedded metadata from a book file — things like the title, author, ISBN, and publisher.

You should see the task appear on the canvas with its input and output ports visible.
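
For a sense of what “embedded metadata” means, the sketch below reads the same kind of fields directly from an EPUB using only the Python standard library. It is not how Zedoc implements Extract File Metadata; the function name read_epub_metadata and the shape of the returned dictionary are made up for illustration. It relies on the standard EPUB layout, in which META-INF/container.xml points to a package document that holds Dublin Core elements such as dc:title and dc:creator.

```python
import zipfile
import xml.etree.ElementTree as ET

# XML namespaces used by the EPUB container and package documents.
NS = {
    "c": "urn:oasis:names:tc:opendocument:xmlns:container",
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def read_epub_metadata(source):
    """Illustrative helper (not a Zedoc API): read Dublin Core fields.

    `source` is a file path or a binary file-like object.
    """
    with zipfile.ZipFile(source) as archive:
        # META-INF/container.xml names the package (OPF) document.
        container = ET.fromstring(archive.read("META-INF/container.xml"))
        rootfile = container.find("c:rootfiles/c:rootfile", NS)
        package = ET.fromstring(archive.read(rootfile.get("full-path")))

    metadata = package.find("opf:metadata", NS)

    def texts(tag):
        # Collect the text of every <dc:tag> element, skipping empty ones.
        return [el.text.strip() for el in metadata.findall(f"dc:{tag}", NS) if el.text]

    return {
        "title": texts("title"),
        "author": texts("creator"),
        "publisher": texts("publisher"),
        "identifier": texts("identifier"),
    }
```

Each field is returned as a list because a book may declare several creators or identifiers. An ISBN, when present, usually appears among the identifier values.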

4. Add the “Extract Contents” Task

Now add the Extract Contents task from the palette. This task pulls the structured text content out of an EPUB file so other tasks can work with it.

You should see both tasks on the canvas, side by side.

5. Add the “Enrich Metadata” Task

Add one more task: Enrich Metadata. This task uses AI to enhance the book’s metadata with marketing descriptions, summaries, and subject classifications.

You now have three tasks on your canvas.

6. Connect the Tasks

Now connect the tasks so data flows between them:

  • Drag from the Book File output port on the pipeline input to the Book File input port on Extract File Metadata.
  • Drag from the same Book File pipeline input to the Book File input port on Extract Contents.
  • Drag from the Book Metadata output on Extract File Metadata to the Book Metadata input on Enrich Metadata.
  • Drag from the Structured Text output on Extract Contents to the Structured Text input on Enrich Metadata.

The editor allows connections only between ports of compatible types, so if a connection does not snap into place, check that you are linking the correct ports.

You should see lines connecting all three tasks, forming a flow from left to right.
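
The wiring from this step can be pictured as a small typed graph. The sketch below is a mental model only, not Zedoc’s API: the Port and PipelineGraph classes and the port helper are hypothetical. It also assumes that each port carries the type it is named after and that “compatible” means identical type names; the real compatibility rules may be richer.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Port:
    owner: str      # task name, or "pipeline" for the pipeline input
    name: str       # port label shown in the editor
    data_type: str  # hypothetical type tag used for the compatibility check


@dataclass
class PipelineGraph:
    edges: list = field(default_factory=list)

    def connect(self, source: Port, target: Port) -> None:
        # Assumption: "compatible" is modeled as identical type tags.
        if source.data_type != target.data_type:
            raise TypeError(
                f"cannot connect {source.owner}.{source.name} ({source.data_type}) "
                f"to {target.owner}.{target.name} ({target.data_type})"
            )
        self.edges.append((source, target))


def port(owner: str, name: str) -> Port:
    # Assumption: every port in this tutorial is named after its type.
    return Port(owner, name, data_type=name)


# The four connections described in the list above.
pipeline = PipelineGraph()
book_file = port("pipeline", "Book File")
pipeline.connect(book_file, port("Extract File Metadata", "Book File"))
pipeline.connect(book_file, port("Extract Contents", "Book File"))
pipeline.connect(
    port("Extract File Metadata", "Book Metadata"),
    port("Enrich Metadata", "Book Metadata"),
)
pipeline.connect(
    port("Extract Contents", "Structured Text"),
    port("Enrich Metadata", "Structured Text"),
)
```

In this model, trying to connect the Book File input to the Structured Text port on Enrich Metadata raises a TypeError, which mirrors a connection that refuses to snap into place in the editor.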

7. Set Up the Pipeline Input

Your pipeline needs to know what data to ask for when you start a run. In this tutorial that is a single EPUB file, supplied through a pipeline input of type Book File.

If the pipeline input was not already created when you connected the tasks in the previous step, add a pipeline input of type Book File now and connect it to the Book File inputs of both Extract File Metadata and Extract Contents.

You should see a pipeline input labeled with the Book File type on the left side of the canvas.

8. Set Up the Pipeline Output

Connect the output of Enrich Metadata to the pipeline’s output area. This tells Zedoc what the final deliverable of this pipeline is — in this case, the enriched metadata.

You should see the pipeline output on the right side of the canvas, connected to the Enrich Metadata task.

9. Run the Pipeline

Now it is time to see your pipeline in action. Click the Run button in the top-right corner.

A dialog will appear asking you to provide the pipeline inputs. Upload an EPUB file — any EPUB you have on hand will work.

Click Start Run to begin.

10. Watch the Progress

You should see the run view, where each task’s status updates in real time. Each task moves through three states: waiting, running, and completed. Because Extract File Metadata and Extract Contents do not depend on each other, they may run at the same time. Enrich Metadata needs the output of both, so it starts only after both have completed.

Once all three tasks have completed, your run is finished.
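
The run order described above follows directly from the connections made in step 6. As a rough illustration, the sketch below groups the three tasks into “waves” that could run concurrently, using graphlib from the Python standard library (Python 3.9 or later). It says nothing about how Zedoc actually schedules work; the dependencies mapping and the execution_waves function are hypothetical names introduced here.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks whose outputs it consumes (step 6 wiring).
dependencies = {
    "Extract File Metadata": set(),
    "Extract Contents": set(),
    "Enrich Metadata": {"Extract File Metadata", "Extract Contents"},
}


def execution_waves(deps):
    """Group tasks into waves; tasks in the same wave do not depend on each other."""
    sorter = TopologicalSorter(deps)
    sorter.prepare()
    waves = []
    while sorter.is_active():
        # Everything whose dependencies are already done is ready to run.
        ready = sorted(sorter.get_ready())
        waves.append(ready)
        sorter.done(*ready)
    return waves


waves = execution_waves(dependencies)
print(waves)
# [['Extract Contents', 'Extract File Metadata'], ['Enrich Metadata']]
```

The first wave holds the two extraction tasks, which is why they may run at the same time; Enrich Metadata sits alone in the second wave because it waits on both.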

11. Review the Results

Click any completed task to inspect its outputs. The output of the Enrich Metadata task shows the enriched book metadata, including any AI-generated descriptions and classifications.

You can also check the pipeline output to see the final deliverable, which in this pipeline is the enriched metadata.

Your first pipeline is complete. You have built a working metadata extraction workflow that you can run again on any book file.

Next Steps

  • Learn more about Pipelines and how they work
  • Explore the Semantic Type System that keeps your connections valid
  • Browse all available Tasks to see what you can add to your pipelines