Thread: DMOZ Extractor

  1. #1
    Join Date
    Apr 2003
    Location
    Atlanta GA
    Posts
    3,395

    Default DMOZ Extractor

    Has anyone used the Extreme DMOZ Extractor that was loaded to the Download page on 16 Oct?
    esm
    "The older I get, the more I admire competence, just simple competence, in any field from adultery to zoology."

    .

  2. #2
    Join Date
    Aug 2001
    Location
    Indonesia
    Posts
    3,732

    Default

    esm, do you have any questions about Extreme?
    The trial version can extract up to 1,000 records from the database. Filters and file splitting are not available in the trial.

  3. #3
    Join Date
    Apr 2003
    Location
    Atlanta GA
    Posts
    3,395

    Default

    Just wondering if anyone had any experience with it. I don't have a need for it myself, unless I try to build some link sites.

    file split?
    esm

  4. #4
    Join Date
    Aug 2001
    Location
    Indonesia
    Posts
    3,732

    Default

    When extracting the dmoz database, you can easily end up with thousands of web links. Say you get 43,000 web links. You can tell Extreme to split the result into several smaller files of 5,000 web links each.

    This is useful if you are going to process the data further in Excel, or if you want to import it into indexu. It's a good idea to import 5,000 records 9 times rather than 43,000 records at once.
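
For anyone who wants to see what such a split amounts to, here is a minimal Python sketch. It is not how Extreme does it internally; it simply assumes the extracted output is a text file with one record per line, and the function name `split_records` and the part-file naming are made up for illustration.

```python
from pathlib import Path


def split_records(src, out_dir, chunk_size=5000):
    """Split a one-record-per-line text file into numbered part files.

    Assumption: each line of `src` is one record (one web link).
    Returns the list of part files that were written.
    """
    src = Path(src)
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    parts = []
    out = None
    with src.open(encoding="utf-8") as lines:
        for i, line in enumerate(lines):
            # Start a new part file every `chunk_size` records.
            if i % chunk_size == 0:
                if out is not None:
                    out.close()
                part = out_dir / f"{src.stem}-{len(parts) + 1:03d}{src.suffix}"
                out = part.open("w", encoding="utf-8")
                parts.append(part)
            out.write(line)
    if out is not None:
        out.close()
    return parts
```

With 43,000 records and the default chunk size this writes 9 files: eight with 5,000 records and a last one with the remaining 3,000.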

  5. #5
    Join Date
    Dec 2003
    Posts
    4

    Default

    I downloaded extreme dmoz extractor but during the installation I get the following error:

    C:\windows\system32\msxml4.dll
    Unable to register the dll/ocx: loadlibrary failed; code 1114
    A dynamic link library (DLL) initialization routine failed.


    and when I try to parse the data I get:

    Run time error '429'
    ActiveX component can't create object


    Would you please help me?

  6. #6
    Join Date
    Aug 2001
    Location
    Indonesia
    Posts
    3,732

    Default

    MSXML 4 failed to install. Let me know your Windows version.

  7. #7
    Join Date
    Dec 2003
    Posts
    4

    Default

    I downloaded the file from Microsoft and now your program appears to work,
    but when I try to do a partial parse I only get the first 1,000 results of
    Top/Arts. I followed your instructions for Top/Sports/Soccer,
    but it always returns data for Top/Arts/movies.

    When I look at the "summary" it says:
    Filters: No


    Why is that? Am I doing something wrong, or is it the program?

  8. #8
    Join Date
    Aug 2002
    Location
    Germany
    Posts
    1,180

    Default ???

    Extreme DMOZ Extractor ?

    Where can I find it ???

    Frank

  9. #9
    Join Date
    Apr 2003
    Location
    Atlanta GA
    Posts
    3,395

    Default

    esm

  10. #10
    Join Date
    Aug 2002
    Location
    Germany
    Posts
    1,180

    Thumbs down ..

    Found it

    http://www.nicecoder.com/community/s...&threadid=1900

    He has the same problem as me. Dody, I'm starting to get a little angry: it seems that guy has been waiting 12 days for an answer, and now I have spent an hour of my time only to find that I have the same problem and nothing works.

    Does this program work? I mean, did you test it successfully?

    Frank

  11. #11
    Join Date
    Apr 2003
    Location
    Atlanta GA
    Posts
    3,395

    Default

    nope, never tried it. That's why I asked if anyone had.
    esm

  12. #12
    Join Date
    Aug 2002
    Location
    Germany
    Posts
    1,180

    Thumbs down ...

    Do not waste your time
    I tried it for 2 hours, and because my nearly 30 emails to the staff went unanswered I also tried various cracks. The problem is that the program won't work, and because of the missing parent_cat_id I have no idea at the moment how it is supposed to work...

    Frank

  13. #13
    Join Date
    Aug 2001
    Location
    Indonesia
    Posts
    3,732

    Default Re: ...

    Originally posted by Frank71
    Do not waste your time
    I tried it for 2 hours, and because my nearly 30 emails to the staff went unanswered I also tried various cracks. The problem is that the program won't work, and because of the missing parent_cat_id I have no idea at the moment how it is supposed to work...

    Frank
    What?? That's strange! I have replied to your emails, Frank. I answered 3 emails from you 2 days ago.
    Where did you send your emails? The address should be support@nicecoder.com. Do not email me at support@indexu.com or support@sentraweb.com.

  14. #14
    Join Date
    Aug 2002
    Location
    Germany
    Posts
    1,180

    Default ???

    Hello Dody,

    Sorry, but I mailed all of your addresses, and you should also have several unread messages in your forum inbox. I'm talking about emails over the whole year...

    Frank

    Btw: The dmoz tool still doesn't work.

  15. #15
    Join Date
    Aug 2001
    Location
    Indonesia
    Posts
    3,732

    Default

    Hi Frank, I am posting my reply here too, so that others with the same problem can solve it here.



    Ok, according to your explanation, I assume that you want to extract the part of dmoz's database that is listed under
    http://dmoz.org/World/Deutsch/Online-Shops/ (there are over 10,000 web links there).
    I'm not sure where it failed for you: when extracting the database, or when importing the extracted database into indexu.


    extracting dmoz's database:
    -------------------------------------

    1. Make sure you have downloaded dmoz.org's database from http://rdf.dmoz.org
    You should get 2 files from there: structure.rdf.u8.gz and content.rdf.u8.gz. Both files are gzip-compressed, so you must decompress them before you can use them. You can use WinZip or WinRAR to extract .gz files. You should now have structure.rdf.u8 and content.rdf.u8.

    2. Open Extreme Dmoz Extractor. First you will need to extract the categories. Your input should be:
    - File type: "Category hierarchy information"
    - Open RDF file: c:\structure.rdf.u8 (should point to the location of your structure.rdf.u8 file on your hard disk)
    - Save output: c:\dmoz\online-shop-cat.txt (this is where your output file will be generated)
    - Arrange default data field: <click default button>
    - Set root: Top/World/Deutsch/Online-Shops (note that this input does not end with /)
    - Split files: (leave this empty)

    3. Click Parse

    That's it; you should now have the output files. Remember that the evaluation version is limited to extracting 1,000 records. It automatically stops parsing when it reaches 1,000.

    4. Follow the same steps above to parse the web links file, content.rdf.u8
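
If you would rather not use WinZip or WinRAR for step 1, the .gz files can also be unpacked with a few lines of Python. This is only a sketch using the standard `gzip` module; the helper name `gunzip` is my own and the paths in the usage comment are examples.

```python
import gzip
import shutil
from pathlib import Path


def gunzip(src, dest_dir=None):
    """Decompress a .gz file and return the path of the unpacked file.

    The output keeps the input name minus the trailing ".gz",
    e.g. content.rdf.u8.gz -> content.rdf.u8.
    """
    src = Path(src)
    if src.suffix != ".gz":
        raise ValueError(f"expected a .gz file, got {src.name}")

    dest_dir = Path(dest_dir) if dest_dir else src.parent
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.stem

    # Stream the data so the large dmoz dumps are never held in memory at once.
    with gzip.open(src, "rb") as fin, dest.open("wb") as fout:
        shutil.copyfileobj(fin, fout)
    return dest


# Example usage (paths are placeholders):
# gunzip(r"c:\structure.rdf.u8.gz")
# gunzip(r"c:\content.rdf.u8.gz")
```

The unpacked files are much larger than the downloads, so make sure there is enough free disk space first.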


    Importing into indexu 3.1
    -------------------------------

    1. Make sure you have installed the Extreme Patch. This patch contains the functionality to import data extracted with Extreme DMOZ Extractor. You can download the patch here: http://www.nicecoder.com/download.php

    2. Remember that data extracted with Extreme may not be usable with your existing database, because the two have different category structures (the id, parent_id, etc.). So you should use an empty database. Otherwise, use MS Excel or a text editor to synchronize them.

    3. Go to indexu administration -> database tables -> import
    Then set the file format to Extreme DMOZ Extractor and click Import. Remember to import the categories first, before the web links.

    4. Follow the above steps to import web links.

    5. Then you need to recalculate the number of web links:
    Click Administrator area -> Tools -> Update Number of Links.
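
On the "synchronize" remark in step 2: one possible way to keep extracted categories from colliding with existing ones is to shift all extracted ids by a fixed offset before importing. The Python sketch below only illustrates that idea. The field names `id` and `parent_id`, the use of 0 as the "no parent" marker, and the function name are all assumptions; check the real column layout of your extracted file and of indexu before relying on anything like this.

```python
def shift_category_ids(rows, offset, root_parent=0):
    """Return copies of category rows with ids moved past existing ones.

    Assumptions (hypothetical, not taken from indexu or Extreme):
    - each row is a dict with integer "id" and "parent_id" keys
    - top-level categories use `root_parent` as their parent_id
    """
    shifted = []
    for row in rows:
        new_row = dict(row)
        new_row["id"] = row["id"] + offset
        # Keep top-level categories attached to the root marker.
        if row["parent_id"] != root_parent:
            new_row["parent_id"] = row["parent_id"] + offset
        shifted.append(new_row)
    return shifted


# Example: the existing database already uses ids up to 500.
extracted = [
    {"id": 1, "parent_id": 0, "name": "Online-Shops"},
    {"id": 2, "parent_id": 1, "name": "Books"},
]
print(shift_category_ids(extracted, offset=500))
```

The web links refer to categories by id as well, so the same offset would have to be applied to their category field. Using an empty database avoids all of this.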

    I think I have explained the steps clearly here:
    http://www.nicecoder.com/dmoz_extractor.php

    Let me know which step failed for you.
