Expected Information
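$$\text{Info}(D) = -\sum_{i=1}^{m} p_i \log_2(p_i)$$

where $p_i$ is the probability that an arbitrary tuple in $D$ belongs to class $C_i$, which can be estimated by $|C_{i,D}| / |D|$. It is the expectation of the surprisal $-\log_2(p_i)$ over every possible outcome.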
It is used to calculate Information Gain and Perplexity.
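As a worked example, assume a hypothetical dataset of 14 tuples in which 9 belong to one class and 5 to the other (these counts are illustrative, not taken from any dataset in this note). Estimating each class probability by its relative frequency gives:

$$-\frac{9}{14}\log_2\frac{9}{14} - \frac{5}{14}\log_2\frac{5}{14} \approx 0.410 + 0.531 = 0.940 \text{ bits}$$

Taking perplexity as 2 raised to this value, the same split has a perplexity of roughly $2^{0.940} \approx 1.92$, slightly less than the 2 that a perfectly balanced two-class split would give.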
Python Implementation
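import pandas as pd
from math import log

def information(dataset: pd.DataFrame, target_attribute: str) -> float:
    # Relative frequency of each class value in the target column
    p = dataset.value_counts(target_attribute) / dataset.shape[0]
    return -sum(pi * log(pi, 2) for pi in p)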
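The sketch below is a quick sanity check of the same quantity on plain Python lists, without pandas. The helper name `entropy_bits` and the sample label lists are hypothetical, introduced only for illustration. It shows the two boundary cases: a set with a single class carries no information, and a set split evenly over k classes carries log2(k) bits.

```python
from collections import Counter
from math import log2

def entropy_bits(labels):
    """Expected information (in bits) of a sequence of class labels."""
    n = len(labels)
    counts = Counter(labels)
    # Each class probability is estimated by its relative frequency.
    return sum(-(c / n) * log2(c / n) for c in counts.values())

pure = ["yes"] * 6             # only one class present
balanced = ["yes", "no"] * 3   # two equally likely classes
four_way = ["a", "b", "c", "d"]  # four equally likely classes

print(entropy_bits(pure))      # 0.0
print(entropy_bits(balanced))  # 1.0
print(entropy_bits(four_way))  # 2.0
```

Summing over only the classes that actually occur avoids evaluating the logarithm at zero, which is also why the pandas version works on the output of `value_counts`.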