List of Tools

MailingListStats

  • URL: http://tools.libresoft.es/mailing_list_stats
  • Author: Israel Herraiz <herraiz at gsyc.escet.urjc.es>.
  • Description: MailingListStats (MLS) is a tool that parses the mbox files of any mailing list and stores their contents in a database.
  • Technologies: Python, MySQL.
  • Execution mode:
    • See the short 'How To' in the README file.
  • Input data: either the URL of each mailing list, or a directory containing the mboxes, with one subdirectory per mailing list.
  • Output data: a database holding the header information of each parsed mbox, together with the mailing list it belongs to.
  • Technology limitations:
    • The database of a mailing list cannot be updated without reparsing the entire mailing list.
    • Duplicate information is stored in the database when MLS is run on an mbox from a mailing list that was already analyzed.
    • Not fault tolerant. MLS cannot resume its work after an unexpected system shutdown; it has to be rerun from the beginning for all mailing lists.
  • Maturity: alpha.
  • Dependencies: Python, MySQL.
  • Documentation:
  • License: GNU General Public License (GPL).
  • Metrics:
    • Name of the poster.
    • Number of non-active developers.
    • E-mail address of the poster.
    • Date when the message was sent by the poster, and date when it was received by the mailing list server.
    • Subject of the message.
    • Mailing list address where the message was sent to.
    • Names and e-mail addresses of recipients other than the list (for example, people included in the CC field of the message).
    • Unique identification tag for the message in the mailing list.
    • Identification tag of the original message if the message is a reply to another.
    • Content of the message, usually including attachments.
    • Name of the program used to write the message.
    • Activity (number of messages).
    • Participation (number of people participating).
    • Number of messages over time.
    • Number of people writing in the list over time.
    • Social network analysis (SNA) methods to study the flow of information within the community, and its evolution over time.
    • Mean and standard deviation of the thread length over time.
    • Usage statistics of the different mail programs used on the mailing list.
    • List of keywords of the topics discussed in the lists.
    • The list of keywords could be obtained on a monthly basis.
    • With a list of keywords for each month, we can find out which topics were discussed the most and whether the topics have evolved.
    • Topics can be cross-correlated with the people involved in discussing them.
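
The header-based metrics listed above can be illustrated with a short Python sketch. This is not the MLS source code: it only uses the standard library `mailbox` and `email.utils` modules, and the function names `extract_headers` and `monthly_activity`, the field names of the returned records, and the choice to skip repeated Message-IDs are all assumptions made for this example. It reads one mbox file, collects the poster, date, subject, recipients, message identifiers and mail program, and then counts messages and distinct posters per month.

```python
import mailbox
from collections import Counter, defaultdict
from email.utils import getaddresses, parseaddr, parsedate_to_datetime


def _text(value):
    """Return a header value as a stripped string, or None if it is missing."""
    return None if value is None else str(value).strip()


def extract_headers(mbox_path):
    """Yield one dict of header fields per message (hypothetical record layout)."""
    box = mailbox.mbox(mbox_path, create=False)
    try:
        for msg in box:
            name, address = parseaddr(str(msg.get("From", "")))
            try:
                sent = parsedate_to_datetime(str(msg.get("Date", "")))
            except (TypeError, ValueError):
                sent = None  # missing or malformed Date header
            # All To/Cc recipients; filtering out the list address is left to the caller.
            raw = [str(v) for v in msg.get_all("To", []) + msg.get_all("Cc", [])]
            yield {
                "name": name,
                "address": address,
                "date": sent,
                "subject": _text(msg.get("Subject")),
                "recipients": getaddresses(raw),
                "message_id": _text(msg.get("Message-ID")),
                "in_reply_to": _text(msg.get("In-Reply-To")),
                "mailer": _text(msg.get("User-Agent") or msg.get("X-Mailer")),
            }
    finally:
        box.close()


def monthly_activity(records):
    """Map "YYYY-MM" to (number of messages, number of distinct posters)."""
    messages = Counter()
    posters = defaultdict(set)
    seen_ids = set()
    for rec in records:
        mid = rec["message_id"]
        if mid is not None:
            if mid in seen_ids:
                continue  # assumption of this sketch: count each Message-ID once
            seen_ids.add(mid)
        if rec["date"] is None:
            continue  # undated messages cannot be assigned to a month
        month = rec["date"].strftime("%Y-%m")
        messages[month] += 1
        if rec["address"]:
            posters[month].add(rec["address"].lower())
    return {m: (messages[m], len(posters[m])) for m in sorted(messages)}
```

For a hypothetical file `list.mbox`, `monthly_activity(extract_headers("list.mbox"))` would give the activity and participation figures per month. The `in_reply_to` field links a reply to its original message, which is the information needed to rebuild threads or a reply network for the SNA metrics.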