Hi!
There seems to be a memory leak in libpostal_parse_address. Memory usage increases over time, even when parsing the same address repeatedly.
My country is
This issue is not specific to any country or address. I tried using other addresses or random strings, but the issue still remains.
Here's how I'm using libpostal
The program parses the example address 10M times and uses Linux pmap to print its memory usage every 100,000 iterations.
// gcc -o app app.c $(pkg-config --cflags --libs libpostal)
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <libpostal/libpostal.h>

int main(int argc, char **argv) {
    if (!libpostal_setup() || !libpostal_setup_parser()) {
        exit(EXIT_FAILURE);
    }
    libpostal_address_parser_options_t options = libpostal_get_address_parser_default_options();
    int count = 10000000;
    int batch = 100000;
    for (int i = 0; i < count; i++) {
        libpostal_address_parser_response_t *parsed = libpostal_parse_address("781 Franklin Ave Crown Heights Brooklyn NYC NY 11216 USA", options);
        libpostal_address_parser_response_destroy(parsed);
        if (i % batch == 0)
        {
            char command[256];
            sprintf(command, "pmap -x %d > %d.txt", getpid(), i / batch + 1);
            puts(command);
            system(command);
        }
    }
    libpostal_teardown();
    libpostal_teardown_parser();
}
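As a lighter-weight alternative to spawning pmap on every batch, the process could sample its own resident set size. The following is only a sketch, assuming a Linux system where /proc/self/status exposes a VmRSS line in kB. The helper name current_rss_kb is hypothetical and is not part of libpostal.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper (not part of libpostal): returns the resident set
 * size of the calling process in kB, or -1 if it cannot be determined
 * (for example on systems without /proc/self/status). */
long current_rss_kb(void) {
    FILE *fp = fopen("/proc/self/status", "r");
    if (fp == NULL) {
        return -1;
    }

    char line[256];
    long rss_kb = -1;
    while (fgets(line, sizeof line, fp) != NULL) {
        /* The line of interest looks like "VmRSS:    123456 kB". */
        if (strncmp(line, "VmRSS:", 6) == 0) {
            if (sscanf(line + 6, "%ld", &rss_kb) != 1) {
                rss_kb = -1;
            }
            break;
        }
    }
    fclose(fp);
    return rss_kb;
}
```

Inside the batch check of the loop above, a call such as printf("%d %ld\n", i, current_rss_kb()); would log one RSS sample per batch without writing a file per sample.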
Here's what I did
See above.
Here's what I got
The memory usage increases over time.
echo "File Kbytes RSS Dirty"; for i in {5..100..5}; do echo -n "$i.txt: " && cat $i.txt | grep total; done
File Kbytes RSS Dirty
5.txt: total kB 1942360 1924872 1921816
10.txt: total kB 2007900 1960788 1957732
15.txt: total kB 2007900 1980316 1977260
20.txt: total kB 2073436 1999848 1996792
25.txt: total kB 2073436 2019380 2016324
30.txt: total kB 2073436 2038912 2035856
35.txt: total kB 2204508 2058444 2055388
40.txt: total kB 2204508 2077972 2074916
45.txt: total kB 2204508 2097504 2094448
50.txt: total kB 2204508 2117036 2113980
55.txt: total kB 2204508 2136568 2133512
60.txt: total kB 2204508 2156100 2153044
65.txt: total kB 2204508 2175632 2172576
70.txt: total kB 2466652 2195160 2192104
75.txt: total kB 2466652 2214692 2211636
80.txt: total kB 2466652 2234224 2231168
85.txt: total kB 2466652 2253756 2250700
90.txt: total kB 2466652 2273288 2270232
95.txt: total kB 2466652 2292816 2289760
100.txt: total kB 2466652 2312348 2309292
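To put a number on the growth, the RSS column can be converted into an average number of bytes retained per call. The sketch below assumes that pmap reports sizes in units of 1024 bytes and that file N was written at iteration (N - 1) * 100000, as in the program above. The function name rss_growth_bytes_per_call is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: average RSS growth in bytes per call between two
 * samples. RSS values are in kB (assumed to be 1024 bytes each); `calls`
 * is the number of parse calls made between the two samples. */
double rss_growth_bytes_per_call(long rss_start_kb, long rss_end_kb, long calls) {
    if (calls <= 0) {
        return 0.0;
    }
    return (double)(rss_end_kb - rss_start_kb) * 1024.0 / (double)calls;
}
```

For example, between 10.txt (RSS 1960788 kB) and 100.txt (RSS 2312348 kB) there are 90 batches, or 9,000,000 calls, so rss_growth_bytes_per_call(1960788, 2312348, 9000000) gives roughly 40 bytes per call. The 5-file steps after 10.txt each add about 19530 kB, which is the same rate.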
I also ran 1M iterations under valgrind, but it does not report any memory leak.
valgrind ./app2
==3615986== Memcheck, a memory error detector
==3615986== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==3615986== Using Valgrind-3.15.0 and LibVEX; rerun with -h for copyright info
==3615986== Command: ./app2
==3615986==
==3615986== Warning: set address range perms: large range [0x2f85a040, 0x3fa385f0) (undefined)
==3615986== Warning: set address range perms: large range [0x3fa39040, 0x4fc175f0) (undefined)
==3615986== Warning: set address range perms: large range [0x3fa391ca, 0x4fc171ca) (defined)
==3615986== Warning: set address range perms: large range [0x3fa39028, 0x4fc17608) (noaccess)
==3615986== Warning: set address range perms: large range [0x6577c040, 0x82a05c8c) (undefined)
==3615986== Warning: set address range perms: large range [0x2f85a028, 0x3fa38608) (noaccess)
==3615986== Warning: set address range perms: large range [0x6577c028, 0x82a05ca4) (noaccess)
==3615986==
==3615986== HEAP SUMMARY:
==3615986== in use at exit: 0 bytes in 0 blocks
==3615986== total heap usage: 71,539,052 allocs, 71,539,052 frees, 7,820,286,857 bytes allocated
==3615986==
==3615986== All heap blocks were freed -- no leaks are possible
==3615986==
==3615986== For lists of detected and suppressed errors, rerun with: -s
==3615986== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
Here's what I was expecting
Memory usage should not increase over time.
For parsing issues, please answer "yes" or "no" to all that apply.
This is not a parsing issue.
Here's what I think could be improved
See above.
More information:
- libpostal git version:
8f2066b1d30f4290adf59cacc429980f139b8545
- OS: Ubuntu 20.04.6 LTS 5.4.0-192-generic
HsinWeiHuang