...
Another way of profiling pyFF's memory usage is to watch the RES column in top or htop for a long-running pyFF/gunicorn process with a 60s refresh interval. I normally use this pipeline:
...
```python
from lxml import etree, objectify
import pickle

# Create the pickled datafile from the metadata aggregate
source = open("edugain.xml", "rb")
sink = open("edugain.pkl", "w", encoding="utf-8")
t = objectify.parse(source)
# pickle produces bytes; decode as latin1 so they can be written as text
p = pickle.dumps(t).decode('latin1')
sink.write(p)

# Read the pickled object back in pyFF
def parse_xml(io):
    return pickle.loads(io.encode('latin1'))

# In the metadata parser:
t = parse_xml(content)  # instead of parse_xml(unicode_stream(content))
```
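To see why the latin1 decode/encode trick round-trips safely, here is a minimal standalone sketch; the dict below is just a stand-in for the parsed tree, since any picklable object behaves the same way:

```python
import pickle

# Stand-in for the lxml tree: any picklable object works the same way.
data = {"entities": ["https://idp.example.org/idp"], "count": 1}

# Serialize: pickle bytes -> str, latin1 maps each byte to exactly one char.
text = pickle.dumps(data).decode("latin1")

# Deserialize: str -> original bytes -> object.
restored = pickle.loads(text.encode("latin1"))
print(restored == data)  # True
```

Because latin1 is a bijection between bytes 0x00-0xFF and the first 256 code points, no information is lost in either direction.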
Using un/pickling, pyFF's gunicorn starts out using ~800 MB of RES, which slowly grows to a steady 1.2-1.5 GB.
...
Using the xml.sax parser, pyFF's gunicorn likewise starts out using ~800 MB of RES, which slowly grows to a steady 1.2-1.5 GB.
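For reference, a stripped-down illustration of the xml.sax style of parsing (the handler and sample XML are mine, not pyFF's actual parser): a streaming ContentHandler visits elements one at a time instead of materializing a full tree, which is where the memory saving would come from.

```python
import xml.sax

class EntityCounter(xml.sax.ContentHandler):
    """Count SAML EntityDescriptor elements without building a tree."""
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        # Strip any namespace prefix before comparing the local name.
        if name.split(":")[-1] == "EntityDescriptor":
            self.count += 1

handler = EntityCounter()
xml.sax.parseString(b"""<EntitiesDescriptor>
  <EntityDescriptor entityID="https://idp.example.org"/>
  <EntityDescriptor entityID="https://sp.example.org"/>
</EntitiesDescriptor>""", handler)
print(handler.count)  # 2
```

The trade-off is that nothing beyond what the handler explicitly keeps survives the parse, so any downstream code that expects an lxml tree has to be adapted.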
...