Thread: DMOZ Extract by Keyword?

  1. #1
    Join Date
    Sep 2007
    Posts
    9

    DMOZ Extract by Keyword?

    Hello,

    Can I extract links from the DMOZ database by keyword?

    I'm a new user and have just installed the DMOZ Extractor. The product description mentioned that you could filter the content from the DMOZ database.

    I was under the impression that this meant by keywords.

    Is this not correct? From what I can tell, I can only extract by category.

    Is there an update, patch or mod that allows keyword extraction?

    Please advise and thank you.

  2. #2
    Join Date
    Jun 2002
    Location
    Winnipeg Canada
    Posts
    4,913

    The simple answer is no.

    The Extreme DMOZ Extractor extracts categories from the DMOZ RDF files.

  3. #3
    Join Date
    Sep 2007
    Posts
    9

    Are there no patches or mods available at all?

  4. #4
    Join Date
    Jun 2002
    Location
    Winnipeg Canada
    Posts
    4,913

    The DMOZ extractor is a 32-bit Windows application. There are no patches or mods available.

    If you would like a directory built on keywords, I could help you out, but it would not be a cheap process. I could extract links from a search engine based on a keyword. This would provide you with the following:

    Domain
    Link
    Title

    I can provide up to 1,000 results per keyword.

    The files sent to you would be importable via the IndexU admin panel. Each keyword would be a category and the links would contain the above information plus the essential data required for the database.

    As for cost, I'm venturing a guess at around $10 per keyword/category depending on the total size.

  5. #5
    Join Date
    Sep 2007
    Posts
    9

    Hello Bruceper,

    I'm not sure how you would plan to do it, but can you write a quick script to create the two files from the *.txt versions of DMOZ?

    I extracted all links and categories from the DMOZ dump and now have two text files. Can you write a script to do the following from those two text files:

    1. Search the content text file for a given keyword.
    2. Copy all records with the keyword to a new text file [keyword.txt].
    3. Then read the category IDs from the new [keyword.txt] file.
    4. Search the structure text file for all records with the specified category IDs.
    5. Copy all required structure records to a new text file [categories.txt].
    6. Create a quick and dirty UI to input the keyword and the two DMOZ source files. The output files can be named automatically.
    7. Then post the script to my webserver for me to test.
    With a simple UI, I can then create other keyword based link files.
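
    For what it's worth, the steps above could be sketched in Python roughly like this. It is only a sketch under assumptions the thread does not confirm: both text files are taken to be tab-delimited with one record per line and the category ID in the first column, and the names `extract_by_keyword`, `DELIM` and `main` are made up for illustration. A small command-line wrapper stands in for the quick and dirty UI.

```python
import argparse
from pathlib import Path

# ASSUMPTION: fields are tab-separated and the category ID is the first field
# in both the content file and the structure file.
DELIM = "\t"


def extract_by_keyword(content_path, structure_path, keyword, out_dir="."):
    """Write <keyword>.txt and categories.txt; return both output paths."""
    needle = keyword.lower()
    out = Path(out_dir)
    links_path = out / f"{keyword}.txt"
    cats_path = out / "categories.txt"
    wanted_ids = set()

    # Steps 1-3: copy every content record that mentions the keyword
    # (case-insensitive substring match) and remember its category ID.
    with open(content_path, encoding="utf-8") as src, \
            open(links_path, "w", encoding="utf-8") as dst:
        for line in src:
            if needle in line.lower():
                dst.write(line)
                wanted_ids.add(line.split(DELIM, 1)[0].strip())

    # Steps 4-5: copy the structure records whose ID was collected above.
    with open(structure_path, encoding="utf-8") as src, \
            open(cats_path, "w", encoding="utf-8") as dst:
        for line in src:
            if line.split(DELIM, 1)[0].strip() in wanted_ids:
                dst.write(line)

    return links_path, cats_path


def main(argv):
    """Step 6: a bare-bones command-line front end (hypothetical)."""
    parser = argparse.ArgumentParser(description="Filter DMOZ text dumps by keyword")
    parser.add_argument("keyword")
    parser.add_argument("content_file")
    parser.add_argument("structure_file")
    parser.add_argument("--out-dir", default=".")
    args = parser.parse_args(argv)
    links, cats = extract_by_keyword(
        args.content_file, args.structure_file, args.keyword, args.out_dir
    )
    print(f"wrote {links} and {cats}")
```

    Calling `main(["hamburgers", "content.txt", "structure.txt"])` would write hamburgers.txt and categories.txt to the current directory. If the real files use a different delimiter or column order, `DELIM` and the `[0]` index would need to change.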

    What do you think about this approach?

    Thanks.

  6. #6
    Join Date
    Jun 2002
    Location
    Winnipeg Canada
    Posts
    4,913

    Of course it can be done that way, and that may be the easiest way to go about it. I have a script (it's been dead for 2 years now) that skims search engines and pulls the links and descriptions out.

    The one issue with dmoz is the fact that dmoz data is typically not current. Don't get me wrong, it is a GREAT start for any directory site, but it needs current sites to stay fresh.

    My theory was to choose a keyword, which would be a category. Let's say that the directory is about food, so I'd pick a category like hamburgers. Then I'd collect the links that are returned for a search on "hamburgers". That's category 1 done.

    Then I'd move on to another category, and so on. Yes, it would take some time to do them all, but in the end I'd have a unique data set. I would also skim from different search engines to keep the data varied.

    Then comes the cleanup of deleting duplicate links, but that's not usually a big deal.
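
    As a rough illustration of that cleanup step, here is a minimal Python sketch. It assumes each link is a dict with the `domain`, `link` and `title` fields mentioned earlier in the thread, and it treats two URLs as duplicates when they differ only in scheme, letter case of the host, a leading "www." or a trailing slash. Both the record layout and that notion of "duplicate" are assumptions, and `normalize_url` and `dedupe_links` are hypothetical names.

```python
from urllib.parse import urlsplit


def normalize_url(url):
    """Build a comparison key: host (lowercased, no 'www.') + path + query."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    key = host + parts.path.rstrip("/")
    if parts.query:
        key += "?" + parts.query
    return key


def dedupe_links(records):
    """Keep the first record seen for each normalized URL, preserving order."""
    seen = set()
    unique = []
    for rec in records:
        key = normalize_url(rec["link"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique
```

    Keeping the first occurrence means that whichever search engine is skimmed first wins when the same site turns up more than once.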
