What are the advantages and disadvantages of using Naive Bayes for spam detection?

Disadvantages:

Naive Bayes assumes that features are conditionally independent given the class, an assumption that does not hold in many real-world scenarios. As a result, it sometimes oversimplifies the problem by treating features as independent and gives subpar performance.
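
To make the assumption concrete, here is a sketch of the standard Naive Bayes factorization. The symbols are chosen for illustration: y is the class (spam or not spam) and x_1, ..., x_n are the features, for example word counts.

```latex
P(y \mid x_1, \dots, x_n) \;\propto\; P(y) \prod_{i=1}^{n} P(x_i \mid y)
```

Each feature contributes its own factor, independent of the others. In spam, words often occur together (for instance "free" and "offer"), so the product can count the same evidence more than once.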

Advantages:

Naive Bayes is very efficient. Training needs only a single pass over the data (no iterative optimization), so it is fast to train and to execute, and it is easily parallelizable. It works well when there is little data and many features, as with bag-of-words text data. For a fixed feature set, its model size (number of parameters) is small and does not grow with the number of training examples (unlike some models, such as decision trees), and it tends not to overfit (it is more likely to underfit than to overfit).
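
The single-pass training can be sketched in a few lines of Python. This is a minimal illustration, not a production implementation: it assumes a multinomial model over word counts with add-one (Laplace) smoothing, and the names `train`, `predict`, and `TOY_DOCS`, as well as the toy messages, are made up for this example.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: (tokens, label) pairs.
TOY_DOCS = [
    ("win cash prize now".split(), "spam"),
    ("cheap prize win win".split(), "spam"),
    ("meeting agenda for monday".split(), "ham"),
    ("lunch on monday".split(), "ham"),
]


def train(docs):
    """Train in a single pass: the model is nothing but counts."""
    class_counts = Counter()            # number of documents per class
    word_counts = defaultdict(Counter)  # word frequencies per class
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab


def predict(model, tokens):
    """Return the class with the highest log-posterior (up to a constant)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in tokens:
            if tok not in vocab:
                continue  # assumption: ignore words never seen in training
            # Add-one smoothing keeps unseen (word, class) pairs from zeroing the product.
            p = (word_counts[label][tok] + 1) / (total_words + len(vocab))
            score += math.log(p)  # independence: one additive term per word
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_DOCS)
print(predict(model, "win a prize".split()))
print(predict(model, "monday meeting".split()))
```

Because the model is only a set of counts, counts computed on separate chunks of the data can simply be added together, which is why training parallelizes easily.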
