Description
None of the attributes parsed in DwarfSymbolProvider.DwarfCompilationUnit.ReadData are deduplicated/interned.
For a big binary (in my case with about 900MB of debug info) this causes extreme memory usage.
Within the first 100 compilation units my memory usage rises to 12GB, and then it gets stuck there because I run out of memory.
As an ultra-ugly hotfix I added this in DwarfSymbolProvider.ParseCompilationUnits:

    public class StringInterner
    {
        // deduplicate strings (or any other object with value equality)
        // meh https://github.com/dotnet/runtime/issues/21603 https://stackoverflow.com/questions/7760364/how-to-retrieve-actual-item-from-hashsett
        ConcurrentDictionary<object, object> stringBank = new ConcurrentDictionary<object, object>();

        public object InternObject(object str)
        {
            if (str == null) return str;
            // GetOrAdd returns the instance already stored for an equal key,
            // or stores str and returns it, in one thread-safe call.
            return stringBank.GetOrAdd(str, str);
        }
    }

    private static DwarfCompilationUnit[] ParseCompilationUnits(byte[] debugData, byte[] debugDataDescription, byte[] debugStrings, NormalizeAddressDelegate addressNormalizer)
    {
        using (DwarfMemoryReader debugDataReader = new DwarfMemoryReader(debugData))
        using (DwarfMemoryReader debugDataDescriptionReader = new DwarfMemoryReader(debugDataDescription))
        using (DwarfMemoryReader debugStringsReader = new DwarfMemoryReader(debugStrings))
        {
            List<DwarfCompilationUnit> compilationUnits = new List<DwarfCompilationUnit>();
            StringInterner interner = new StringInterner();
            List<Task> tasksList = new List<Task>();

            while (!debugDataReader.IsEnd)
            {
                DwarfCompilationUnit compilationUnit = new DwarfCompilationUnit(debugDataReader, debugDataDescriptionReader, debugStringsReader, addressNormalizer, interner);

                tasksList.Add(Task.Run(() =>
                {
                    // intern all attributes in separate threads
                    foreach (var compilationUnitSymbol in compilationUnit.Symbols)
                    {
                        compilationUnitSymbol.Attributes =
                            compilationUnitSymbol.Attributes
                                .Select(x => new KeyValuePair<DwarfAttribute, DwarfAttributeValue>(x.Key, interner.InternObject(x.Value) as DwarfAttributeValue))
                                .ToDictionary(x => x.Key, x => x.Value);
                    }
                }));
                compilationUnits.Add(compilationUnit);
            }

            // wait for all interning tasks before handing the units back
            Task.WaitAll(tasksList.ToArray());
            return compilationUnits.ToArray();
        }
    }

This keeps my memory usage at the 400th compilation unit down at 7.7GB, which is at least usable.
I originally did the interning in DwarfCompilationUnit.data, but that took too much time. The data reading is already the performance bottleneck, so it is better not to add anything extra to it.
Moving it out into a separate thread/task works well for me so far.
One could probably intern the whole attribute instead of just the attribute value. I'm not sure that would be better, and I assume it wouldn't.
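
One caveat worth spelling out: dictionary-based interning only collapses duplicates when the interned type has value-based Equals/GetHashCode. With the default reference equality, every instance is its own key and nothing is saved. Whether DwarfAttributeValue overrides these is not shown here, so the following is only a minimal standalone sketch. AttributeValue and Interner<T> are hypothetical stand-ins, not types from the library.

```csharp
using System;
using System.Collections.Concurrent;

// Hypothetical stand-in for DwarfAttributeValue (an assumption, not the real type).
// Value-based Equals/GetHashCode is what makes two separately parsed
// but identical values hit the same dictionary entry.
public sealed class AttributeValue
{
    public int Kind { get; }
    public string Text { get; }

    public AttributeValue(int kind, string text)
    {
        Kind = kind;
        Text = text;
    }

    public override bool Equals(object obj) =>
        obj is AttributeValue other && Kind == other.Kind && Text == other.Text;

    public override int GetHashCode() => (Kind, Text).GetHashCode();
}

// Hypothetical typed interner; same idea as StringInterner, without the casts.
public sealed class Interner<T> where T : class
{
    private readonly ConcurrentDictionary<T, T> bank = new ConcurrentDictionary<T, T>();

    // Returns the first stored instance that is equal to value.
    public T Intern(T value) => value == null ? null : bank.GetOrAdd(value, value);

    public int Count => bank.Count;
}

public static class Demo
{
    public static void Main()
    {
        var interner = new Interner<AttributeValue>();
        var a = interner.Intern(new AttributeValue(1, "int"));
        var b = interner.Intern(new AttributeValue(1, "int"));

        Console.WriteLine(ReferenceEquals(a, b)); // True: the second instance was dropped
        Console.WriteLine(interner.Count);        // 1
    }
}
```

If Equals/GetHashCode were removed from AttributeValue, the same program would keep both instances, which is the case to rule out before relying on the hotfix.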