...

Another way of profiling pyFF's memory usage is simply to watch RES in top or htop for a long-running pyFF/gunicorn process that has a 60s refresh interval. I normally use this pipeline:

...
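To log RES over time instead of eyeballing top, a small poller also works. This is a minimal sketch, assuming the psutil package is installed and you know the gunicorn worker's PID; neither is part of the original setup.

Code Block
import time
import psutil  # assumed third-party dependency

def log_res(pid, interval=60):
    """Print the RES (resident set size) of a process every `interval` seconds."""
    proc = psutil.Process(pid)
    while True:
        rss_mb = proc.memory_info().rss / (1024 * 1024)
        print(f"{time.strftime('%H:%M:%S')} RES: {rss_mb:.0f} MB")
        time.sleep(interval)

# log_res(12345)  # PID of the long-running gunicorn worker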

Code Block
from lxml import etree, objectify
import pickle

# Create pickled datafile
source = open("edugain.xml", "r", encoding="utf-8")
# latin1 maps bytes 0-255 one-to-one to code points, so the pickle
# bytes survive the decode/encode round trip below
sink = open("edugain.pkl", "w", encoding="latin1")

t = objectify.parse(source)
p = pickle.dumps(t).decode('latin1')
sink.write(p)
sink.close()

# Read pickled object back in pyFF
def parse_xml(io):
    return pickle.loads(io.encode('latin1'))

In the metadata parser:
t = parse_xml(content)  # instead of parse_xml(unicode_stream(content))
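The latin1 round trip is lossless by design: latin1 maps every byte value 0-255 to the code point with the same number, so decode('latin1') followed by encode('latin1') reproduces the pickle bytes exactly, whereas most other codecs would reject arbitrary binary data.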

Using un/pickling, pyFF's gunicorn starts out using ~800 MB of RES, which slowly grows to a steady 1.2-1.5 GB.

...

Using the xml.sax parser, pyFF's gunicorn starts out using ~800 MB of RES, which slowly grows to a steady 1.2-1.5 GB.
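For reference, the shape of an xml.sax approach looks like the sketch below: a streaming handler that walks SAML metadata without building a full tree. The EntityCounter class and the edugain.xml filename are illustrative assumptions, not pyFF's actual parser code.

Code Block
import xml.sax

class EntityCounter(xml.sax.ContentHandler):
    """Stream through SAML metadata and count EntityDescriptor elements."""
    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        # matches both prefixed (md:EntityDescriptor) and unprefixed names
        if name.endswith("EntityDescriptor"):
            self.count += 1

handler = EntityCounter()
xml.sax.parse("edugain.xml", handler)
print(f"parsed {handler.count} entities")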

...