Extract data from the "Share and Multiply" dataset for use with MGen.
This repo contains tools to extract empirical distributions from the "Share and Multiply" (SaM) dataset of WhatsApp chat metadata.
More thorough documentation is coming soon, but the gist is:
1. Obtain the `json_files.zip` file they provide, and extract it somewhere.
2. Use the `extract` tool to pare and serialize the SaM data.
3. Use `hmm` to label messages as "active" or "idle".
4. Use the `process` tool to generate all empirical distributions other than message sizes.
5. Use the `message-lens` tool to generate distributions for message sizes.
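
Until that documentation exists, the following is a minimal sketch of what "empirical distribution" means in the last two steps: each distinct observed value is mapped to its relative frequency. It is not the code used by `process` or `message-lens`; the function name `empirical_distribution` and the sample values are hypothetical, and the actual output format of the tools may differ.

```rust
use std::collections::BTreeMap;

/// Map each distinct observed value to its relative frequency.
/// Hypothetical helper for illustration; not part of this repo's tools.
fn empirical_distribution(samples: &[u64]) -> BTreeMap<u64, f64> {
    // Count how many times each value occurs.
    let mut counts: BTreeMap<u64, u64> = BTreeMap::new();
    for &s in samples {
        *counts.entry(s).or_insert(0) += 1;
    }
    // Normalize counts so the frequencies sum to 1.
    // An empty input yields an empty map.
    let total = samples.len() as f64;
    counts
        .into_iter()
        .map(|(value, count)| (value, count as f64 / total))
        .collect()
}

fn main() {
    // Made-up message sizes in bytes; NOT taken from the SaM dataset.
    let sizes = [120, 64, 120, 512, 64, 120, 64, 120];
    for (value, p) in empirical_distribution(&sizes) {
        println!("{value}\t{p}");
    }
}
```

A `BTreeMap` keeps the values sorted, which makes it straightforward to turn the result into a cumulative distribution for sampling.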