Model building code.
Functions
def | spam_filter.make_Dictionary (emails) |
Method to create Dictionary. |
def | spam_filter.extract_features (files) |
Method to extract features from all mails. |
def | spam_filter.mail_features (mail) |
Method to find features of a single mail. |
def | spam_filter.preprocessor (mail) |
Method to pre-process the mails. |
def | spam_filter.find_payload (mail_body, all_words) |
Method to recursively find single part payloads. |
def | spam_filter.split_payload (payload, all_words) |
Method to split the large payloads into smaller chunks. |
def | spam_filter.get_words_plain (content, all_words) |
Method to get words out of plain text content. |
def | spam_filter.get_words_html (content, all_words) |
Method to get words out of html content. |
This code builds and trains a new machine learning model.
ham_dir | Directory containing ham mails for training |
spam_dir | Directory containing spam mails for training |
def spam_filter.extract_features (files)
Method to extract features from all mails.
mail_dir | The directory containing mails |
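The page does not say what form the extracted features take. As a minimal sketch, assuming a bag-of-words representation (one row per mail, one column per dictionary word, each cell a word count), the idea might look like the following. The name `extract_features_sketch` and both of its parameters are hypothetical and are not part of `spam_filter`; the real function takes `files` and may work differently.

```python
def extract_features_sketch(tokenized_mails, dictionary):
    """Build a word-count matrix (assumed representation, not the module's own).

    tokenized_mails: list of word lists, one list per mail (hypothetical input).
    dictionary: list of words; the position of each word is its column index.
    """
    # Map each dictionary word to its column once, so lookups are O(1).
    column = {word: i for i, word in enumerate(dictionary)}
    matrix = []
    for words in tokenized_mails:
        row = [0] * len(dictionary)
        for word in words:
            if word in column:
                row[column[word]] += 1
        matrix.append(row)
    return matrix
```

Words that are not in the dictionary are ignored, so every row has the same length regardless of the mail's content.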
def spam_filter.find_payload (mail_body, all_words)
Method to recursively find single part payloads.
mail_body | The complete mail body |
all_words | List of all words in the mail |
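The recursion described here can be sketched with the standard `email` package. This is an illustration under assumptions, not the module's implementation: `find_payload_sketch` is a hypothetical name, `mail_body` is assumed to be a parsed `email.message.Message`, and words are assumed to be split on whitespace.

```python
from email import message_from_string
from email.message import Message


def find_payload_sketch(mail_body: Message, all_words: list) -> None:
    """Descend into multipart messages and collect words from each leaf part."""
    if mail_body.is_multipart():
        # For a multipart message, get_payload() returns a list of sub-messages.
        for part in mail_body.get_payload():
            find_payload_sketch(part, all_words)
    else:
        # A single-part payload: undo the transfer encoding, then decode to text.
        raw = mail_body.get_payload(decode=True) or b""
        charset = mail_body.get_content_charset() or "utf-8"
        text = raw.decode(charset, errors="replace")
        all_words.extend(text.split())


# A small two-part sample mail (invented for this example).
RAW_MAIL = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/mixed; boundary="XX"\n'
    "\n"
    "--XX\n"
    "Content-Type: text/plain\n"
    "\n"
    "hello world\n"
    "--XX\n"
    "Content-Type: text/plain\n"
    "\n"
    "cheap offer\n"
    "--XX--\n"
)

words = []
find_payload_sketch(message_from_string(RAW_MAIL), words)
```

Passing `all_words` down through the recursion lets every level append to the same list, which matches the parameter description above.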
def spam_filter.get_words_html (content, all_words)
Method to get words out of html content.
content | The html content |
all_words | List of all words in the mail |
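One way to pull words out of HTML with only the standard library is to collect the text nodes and then tokenize them. The sketch below assumes lowercase alphabetic tokens; `get_words_html_sketch` and `_TextCollector` are hypothetical names, and the module itself may use a different parser or tokenization rule.

```python
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects the text between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)


def get_words_html_sketch(content: str, all_words: list) -> None:
    """Append the lowercase alphabetic words found in HTML content."""
    parser = _TextCollector()
    parser.feed(content)
    parser.close()
    text = " ".join(parser.chunks)
    all_words.extend(re.findall(r"[a-z]+", text.lower()))
```

Note that this simple collector also keeps the text inside `<script>` and `<style>` elements; a stricter version would skip those.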
def spam_filter.get_words_plain (content, all_words)
Method to get words out of plain text content.
content | Plain text content |
all_words | List of all words in the mail |
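For plain text, a regular expression is enough for a first approximation. The tokenization rule here (lowercase runs of ASCII letters) is an assumption, and `get_words_plain_sketch` is a hypothetical name rather than the module's function.

```python
import re


def get_words_plain_sketch(content: str, all_words: list) -> None:
    """Append the lowercase alphabetic words found in plain text."""
    all_words.extend(re.findall(r"[a-z]+", content.lower()))


words = []
get_words_plain_sketch("Win a FREE prize, today!", words)
# words is now ['win', 'a', 'free', 'prize', 'today']
```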
def spam_filter.mail_features (mail)
Method to find features of a single mail.
mail | The address of the mail |
def spam_filter.make_Dictionary (emails)
Method to create Dictionary.
train_dir | The directory containing mails |
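How the dictionary is built is not documented on this page. A common approach, shown here only as an assumption, is to count every word across the training mails and keep the most frequent ones. `make_dictionary_sketch`, its `tokenized_mails` input, and the `size` cutoff are all hypothetical.

```python
from collections import Counter


def make_dictionary_sketch(tokenized_mails, size=3000):
    """Return the `size` most frequent words as (word, count) pairs.

    tokenized_mails: iterable of word lists, one per mail (hypothetical input).
    """
    counts = Counter()
    for words in tokenized_mails:
        # Drop non-alphabetic tokens and single characters, which carry little signal.
        counts.update(w for w in words if w.isalpha() and len(w) > 1)
    return counts.most_common(size)
```

Limiting the dictionary to a fixed size keeps the feature vectors the same length for every mail.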
def spam_filter.preprocessor (mail)
Method to pre-process the mails.
mail | The address of the mail |
def spam_filter.split_payload (payload, all_words)
Method to split the large payloads into smaller chunks.
payload | The complete payload |
all_words | List of all words in the mail |
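The chunk size and the splitting rule are not stated on this page. The sketch below assumes the payload is text and splits on whitespace so that no word is cut in half; `split_payload_sketch` and `max_chars` are hypothetical, and unlike the documented function it yields chunks instead of filling `all_words` directly.

```python
def split_payload_sketch(payload: str, max_chars: int = 1000):
    """Yield whitespace-joined chunks of roughly max_chars characters each."""
    chunk, size = [], 0
    for word in payload.split():
        # Start a new chunk when adding this word would exceed the limit.
        if chunk and size + len(word) + 1 > max_chars:
            yield " ".join(chunk)
            chunk, size = [], 0
        chunk.append(word)
        size += len(word) + 1
    if chunk:
        yield " ".join(chunk)
```

Each yielded chunk could then be passed to a word extractor that appends to `all_words`. A single word longer than `max_chars` is emitted as its own oversized chunk.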