Extract data from the "Share and Multiply" dataset for use with MGen.
This repo contains tools to extract empirical distributions from the "Share and Multiply" (SaM) dataset of WhatsApp chat metadata.
More thorough documentation is coming soon, but the gist is:
1. Obtain the SaM dataset, specifically the `json_files.zip` file its authors provide, and extract it somewhere.
2. Run the `extract` tool to pare and serialize the SaM data.
3. Run `hmm` to label messages as "active" or "idle".
4. Run the `process` tool to generate all empirical distributions other than message sizes.
5. Run the `message-lens` tool to generate distributions for message sizes.
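
To make "empirical distribution" concrete, here is a minimal sketch of one way such a distribution could be represented: a sorted list of observed values that supports CDF and quantile lookups. The `EmpiricalDist` type and its methods are hypothetical names chosen for illustration; they are not taken from this repo's tools, whose actual output format may differ.

```rust
/// Hypothetical helper, not part of this repo: an empirical distribution
/// built directly from observed samples (e.g. message sizes in bytes).
#[derive(Debug, Clone, PartialEq)]
pub struct EmpiricalDist {
    /// Observed samples in ascending order.
    sorted: Vec<u64>,
}

impl EmpiricalDist {
    /// Builds the distribution from raw samples.
    /// Returns `None` when there are no samples to describe.
    pub fn from_samples(mut samples: Vec<u64>) -> Option<Self> {
        if samples.is_empty() {
            return None;
        }
        samples.sort_unstable();
        Some(Self { sorted: samples })
    }

    /// Empirical CDF: the fraction of samples that are <= `x`.
    pub fn cdf(&self, x: u64) -> f64 {
        // Binary search for the first sample greater than `x`.
        let count = self.sorted.partition_point(|&s| s <= x);
        count as f64 / self.sorted.len() as f64
    }

    /// Smallest sample whose empirical CDF is at least `p`.
    /// `p` is clamped to the range [0, 1].
    pub fn quantile(&self, p: f64) -> u64 {
        let p = p.clamp(0.0, 1.0);
        let n = self.sorted.len();
        // Convert the probability to a 1-based rank, then to an index.
        let rank = (p * n as f64).ceil() as usize;
        let idx = rank.max(1) - 1;
        self.sorted[idx.min(n - 1)]
    }
}
```

With the samples `[30, 10, 20, 20]`, `cdf(20)` is `0.75` and `quantile(0.5)` is `20`. Passing a uniformly random `p` to `quantile` draws a value from the empirical distribution (inverse transform sampling), which is one way a generator could consume data of this kind.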