Schema needed when uploading AVRO file from local filesystem, but not from GCS. #3416

@tweeter0830

I'm trying to upload an Avro file from my local filesystem using the following code:

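  import tempfile

  import avro.datafile
  import avro.io
  import avro.schema

  # make_random_string(), dataset_, schema_filepath and
  # wait_for_job_to_finish() are not shown here.
  random_table_name = make_random_string()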
  table = dataset_.table(name=random_table_name)

  with open(schema_filepath, 'rb') as file_:
    schema = avro.schema.parse(file_.read())

  with tempfile.NamedTemporaryFile() as temp_file:
    with open(temp_file.name, 'wb') as file_:
      writer = avro.datafile.DataFileWriter(
          file_, avro.io.DatumWriter(), schema)
      writer.close()

    with open(temp_file.name, 'rb') as file_:
      job = table.upload_from_file(file_, source_format='AVRO')
    wait_for_job_to_finish(job)

I get the following error:

google.cloud.exceptions.BadRequest: 400 Empty schema specified for the load job. Please specify a schema that describes the data being loaded. (https://www.googleapis.com/upload/bigquery/v2/projects/aircraft-audio-classification/jobs?uploadType=multipart)

I don't need to specify a schema when using the web API or when loading a file from GCS.
