Graph-theoretical techniques for web content mining