normalize start_method spawn seems to ignore environment variables #3353

@furmangg

dlt version

1.18.2

Describe the problem

If I set the following environment variable, the filesystem destination writes uncompressed .jsonl files:

os.environ['NORMALIZE__DATA_WRITER__DISABLE_COMPRESSION'] = 'true' 

However, if I configure multiple normalize workers with the spawn start method, the output is written as .gz files instead:

os.environ['NORMALIZE__WORKERS'] = '4'
os.environ['NORMALIZE__START_METHOD'] = 'spawn'  # https://dlthub.com/docs/reference/performance#normalize
os.environ['NORMALIZE__DATA_WRITER__DISABLE_COMPRESSION'] = 'true'

I also added the following to .dlt/config.toml, but the output was still written as .gz when using multiple normalize workers with the spawn start method:

[normalize.data_writer]
disable_compression=true

Expected behavior

Regardless of the number of normalize workers and the start method, dlt should produce a consistent file format.
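
To compare runs, one option is to inspect the files written to the filesystem destination and check whether each one is gzip-compressed. The sketch below is not part of dlt: `classify_load_files` is a hypothetical helper name, and it assumes the output lives in a local directory. It looks at the leading bytes rather than the file extension, so it should work whatever naming the destination uses.

```python
from collections import Counter
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def classify_load_files(root: str) -> Counter:
    """Count files under `root` as 'gzip' or 'plain' by their leading bytes.

    Hypothetical helper for this issue; `root` is assumed to be the local
    directory the filesystem destination wrote to.
    """
    counts: Counter = Counter()
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        with path.open("rb") as fh:
            head = fh.read(2)
        counts["gzip" if head == GZIP_MAGIC else "plain"] += 1
    return counts
```

Running this against the output directory after a single-worker run and again after a run with 4 workers and spawn would show whether the `disable_compression` setting was honored in both cases.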

Steps to reproduce

See above

Operating system

Linux

Runtime environment

Other

Python version

3.11

dlt data source

REST API

dlt destination

Filesystem & buckets

Other deployment details

No response

Additional information

No response

Labels

bug: Something isn't working

Projects

Status

In Progress
