8–19 June 2009
Time zone: Europe/Paris

Session

Group 2: Hands-on session 2.5

19 June 2009, 14:00

Description

· Reverse index (1.5h session)
Description
Practical session. Design and implement a MapReduce-based algorithm to build an inverted index over the web crawl data.
Contents
· Count words and eliminate those appearing in most (or all) documents (“and”, “or”, etc.).
· Implement the inverted index. It should not contain the words identified before. Several MapReduce passes might be necessary.
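The two steps above can be sketched as two MapReduce passes. The following is only an illustrative, in-memory sketch in Python, not the session's actual material: the `run_mapreduce` helper, the whitespace tokenizer, the 0.9 document-frequency threshold, and the sample documents are all assumptions made for the example. A real job over the web crawl data would run on a MapReduce framework and would need proper tokenization.

```python
from collections import defaultdict


def run_mapreduce(records, mapper, reducer):
    """Hypothetical in-memory stand-in for a MapReduce runtime:
    apply the mapper, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)
    return {key: reducer(key, values) for key, values in groups.items()}


def tokenize(text):
    # Assumption: lowercase and split on whitespace; punctuation is not handled.
    return text.lower().split()


# Pass 1: count in how many documents each word appears.
def df_mapper(record):
    doc_id, text = record
    for word in set(tokenize(text)):  # set(): count each word once per document
        yield word, 1


def df_reducer(word, counts):
    return sum(counts)


def find_common_words(docs, threshold=0.9):
    """Return words present in at least `threshold` of all documents.
    The threshold value is an assumption, to be tuned on real data."""
    doc_freq = run_mapreduce(docs, df_mapper, df_reducer)
    return {w for w, n in doc_freq.items() if n / len(docs) >= threshold}


# Pass 2: build the inverted index, skipping the words found in pass 1.
def build_inverted_index(docs, excluded):
    def mapper(record):
        doc_id, text = record
        for word in set(tokenize(text)):
            if word not in excluded:
                yield word, doc_id

    def reducer(word, doc_ids):
        return sorted(doc_ids)

    return run_mapreduce(docs, mapper, reducer)


# Made-up sample documents standing in for the web crawl data.
docs = [
    ("d1", "cats and dogs"),
    ("d2", "dogs and birds"),
    ("d3", "birds and cats or fish"),
]
common = find_common_words(docs, threshold=0.9)
index = build_inverted_index(docs, common)
```

With these sample documents, “and” appears in every document and is excluded, so `index` maps each remaining word to the sorted list of documents containing it. Keeping the two passes separate mirrors the note that several MapReduce passes might be necessary: the second pass needs the complete output of the first before it can filter.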

Presentation materials

No documents.