Configure environment variables for ROCm-RAG#
You can configure both extraction and retrieval parameters by setting environment variables for the Docker container used in the ROCm-RAG installation. There are three ways to set environment variables (a combined example follows this list):

- Start with `default.env` as a base, modify the variables as needed, and provide the `.env` file when running the container:

  ```bash
  docker run --env-file <your env file> ...
  ```

- Set variables individually when starting the container:

  ```bash
  docker run -e VAR1=value1 -e VAR2=value2 ...
  ```

- Export variables inside the container when running in interactive mode:

  ```bash
  export VAR1=value1
  export VAR2=value2
  ```
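For example, a typical workflow based on the first method copies `default.env` (a plain list of `KEY=value` lines), edits the copy, and starts the container with it. The file name `my-rag.env`, the placeholder variable `VAR1`, and the image name `<rocm-rag-image>` below are illustrative only and are not defined by this guide:

```bash
# Start from the shipped defaults and edit the copy.
cp default.env my-rag.env
vi my-rag.env                     # adjust KEY=value entries as needed

# Start the container with the edited file; individual variables can
# still be overridden on the command line with -e.
docker run --env-file my-rag.env -e VAR1=value1 <rocm-rag-image>
```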
Environment variable reference#
The following tables list the configurable environment variables for ROCm-RAG. See `default.env` for the variable names and their default values.
Workspace and storage variables

| Variable | Description |
|---|---|
|  | ROCm-RAG workspace directory |
|  | Directory to save page-level hashes |
|  | File to save the list of scraped URLs |
Extraction parameters

| Variable | Description |
|---|---|
|  | Extraction RAG framework (Haystack or LangGraph) |
|  | Haystack pipeline server port |
|  | LangGraph server port |
|  | Embedder model |
|  | Embedder API base URL |
|  | Embedder API port |
|  | Embedder model maximum token limit |
|  | Weaviate DB API base URL |
|  | Weaviate DB API port |
|  | Weaviate class name |
|  | Wait time for the vector DB server to be ready |
|  | Wait time for the embedder server to be ready |
|  | Tensor parallelism for the embedder |
|  | List of visible GPUs when deploying the embedder model |
|  | Start URL for scraping |
|  | List of supported URL extensions to scrape |
|  | List of regex filters for selecting valid pages to scrape |
|  | List of regex filters for identifying pages that require human verification |
|  | List of regex filters for identifying not-found pages |
|  | Enable a limit on the maximum number of pages to scrape |
|  | Maximum number of pages to scrape |
|  | Maximum number of tokens for SemanticChunkMerger |
|  | Similarity threshold for SemanticChunkMerger to merge chunks |
Retrieval parameters

| Variable | Description |
|---|---|
|  | Retrieval RAG framework (Haystack or LangGraph) |
|  | Deploy the example LLM inference server inside this Docker container |
|  | LLM API base URL |
|  | LLM API port |
|  | LLM model |
|  | Tensor parallelism for the LLM |
|  | List of visible GPUs when deploying the example LLM |
|  | Certainty threshold for retrieval |
|  | Top-K retrieved documents for the Haystack retrieval pipeline |
|  | Top-K retrieved documents for the LangGraph retrieval pipeline |
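Once the container is running, you can check which of these values are actually in effect by listing the environment inside it. In this sketch, `rocm-rag` is a placeholder container name (for example, one assigned with `docker run --name`), not a name defined by this guide:

```bash
# List the environment variables active inside a running ROCm-RAG container.
docker exec rocm-rag env | sort
```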