27.5. How to enter thousands of web links

Figure 27-4. Administration panel: Web Links.


You have gathered, or found, thousands of web links that you would like to enter in the PHP-Nuke Web Links module, and you don't want to do this manually. You can use sed to produce the MySQL commands in a script file that you then feed to MySQL, as described in How to enter thousands of web links in PHP-Nuke.

The first step is, of course, to find the page that contains all the links you need. This may be an HTML file, but it may also be a CSV file exported from an MS Access database. This doesn't matter, the procedure is the same: edit the source with sed to convert every line that holds a link and a description into an SQL command that inserts the link and its description into the appropriate table.
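For the CSV case, the same idea can be sketched as follows. This is only an illustration under stated assumptions: the file links.csv, its three-field layout (URL, title, description, no quoted fields, no commas in the first two fields) and the script name sedscr_csv are hypothetical, and the category id and date are the same placeholder values used in the example of this section.

```shell
work=$(mktemp -d)

# Hypothetical sample input, one link per line: url,title,description
cat > "$work/links.csv" <<'EOF'
http://www.example.org/,Bob's Site,A placeholder, with a comma
EOF

# sed script: first double every single quote (SQL string escaping),
# then turn url,title,description into one INSERT statement.
# \1 = url, \2 = title, \3 = rest of the line (may contain commas).
cat > "$work/sedscr_csv" <<'EOF'
s/'/''/g
s/^\([^,]*\),\([^,]*\),\(.*\)$/INSERT INTO nuke_links_links( lid, cid, sid, title, url, description, date, name, email, hits, submitter, linkratingsummary, totalvotes, totalcomments ) VALUES ( '0', '100', '0', '\2', '\1', '\3', '2003-06-23 06:05:45', '', '', '', '', '', '', '');/p
EOF

# -n plus the p flag: only lines that matched the three-field pattern are printed.
sed -n -f "$work/sedscr_csv" "$work/links.csv" > "$work/links.sql"
cat "$work/links.sql"
```

Lines that do not have at least three comma-separated fields are silently skipped, so it is worth comparing the line counts of the input and output files afterwards.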

Example: Let's consider the page Content Management - PHP-Nuke of the Open Directory (DMOZ) for a real world example. Save it as dmoz.html. To enter all the web links contained in dmoz.html into the Web Links module[1], we have to create the appropriate MySQL commands. To this end, save the following sed script under the name "sedscr_downloads":

s/<li><a href="\([^"]*\)">\(.*\)<\/a>\(.*\)$/INSERT INTO nuke_links_links( lid, cid, sid, title, url, description, date, name, email, hits, submitter, linkratingsummary, totalvotes, totalcomments ) VALUES ( '0', '100', '0', '\2', '\1', '\3', '2003-06-23 06:05:45', '', '', '', '', '', '', '');/p

This is a one-line sed command that creates an SQL INSERT query for every line of dmoz.html that contains a link. It captures the link's URL, title and description in sed's numbered groups 1, 2 and 3 respectively. It then uses the values of those groups (you can see them as the escaped numbers \1, \2 and \3 in the VALUES part of the query) to fill the appropriate fields of the nuke_links_links table. You can download this sed script from sedscr_downloads. Adapt sedscr_downloads to your situation (e.g. change cid from 100 to the right category id, add values for name and email, change the date or the prefix of the links_links table). Then, run it as follows:

sed -n -f sedscr_downloads dmoz.html > dmoz.sql

This will produce the file dmoz.sql, containing lines like this one:

INSERT INTO nuke_links_links( lid, cid, sid, title, url, description, 
date, name, email, hits, submitter, linkratingsummary, totalvotes, 
totalcomments ) VALUES ( '0', '100', '0', 'PHPNuke: Management and Programming',
'http://www.karakas-online.de/EN-Book/', ' - This tutorial describes the 
installation and structure of PHPNuke. It also delves into more advanced issues, 
like the programming of PHPNuke blocks and modules.', '2003-06-23 06:05:45', 
'', '', '', '', '', '', '');
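One thing to watch: a title or description that contains a single quote (an apostrophe) would end the SQL string early and break the query. The following is a minimal sketch of one way to handle this, not the script from this section: it doubles every single quote on link lines before building the INSERT, and it generates the sed script from shell variables so that the category id, table prefix and date are easy to change. The names CID, PREFIX, STAMP, sedscr_links and sample.html are made up for this illustration, and the variable values must not contain "/" or "&", which are special in a sed replacement.

```shell
# Hypothetical settings: adapt to your category id, table prefix and date.
CID=100
PREFIX=nuke
STAMP='2003-06-23 06:05:45'
work=$(mktemp -d)

# Tiny stand-in for dmoz.html: one line without a link, one with a link.
cat > "$work/sample.html" <<'EOF'
<h1>Content Management</h1>
<li><a href="http://www.example.org/">Bob's Nuke Page</a> - Tips for PHP-Nuke admins.
EOF

# Unquoted here-document: the shell fills in $PREFIX, $CID and $STAMP.
# \$ becomes a literal $ (end-of-line anchor); other backslashes pass through.
cat > "$work/sedscr_links" <<EOF
/<li><a href="/ {
  s/'/''/g
  s/<li><a href="\([^"]*\)">\(.*\)<\/a>\(.*\)\$/INSERT INTO ${PREFIX}_links_links( lid, cid, sid, title, url, description, date, name, email, hits, submitter, linkratingsummary, totalvotes, totalcomments ) VALUES ( '0', '${CID}', '0', '\2', '\1', '\3', '${STAMP}', '', '', '', '', '', '', '');/p
}
EOF

sed -n -f "$work/sedscr_links" "$work/sample.html" > "$work/sample.sql"
cat "$work/sample.sql"
```

In the generated statement the title appears as 'Bob''s Nuke Page', which MySQL reads as a string containing one apostrophe. Backslashes in the source text are not handled by this sketch and may need similar treatment.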

A line like the above will insert a Web Link in the module when run from the MySQL command line. As you can see, dmoz.sql contains the complete set of INSERT commands for all links contained in dmoz.html. To execute the commands from MySQL, type the following (adding your database name and any user or password options your installation requires):

mysql < dmoz.sql

and all links will be inserted in the Web Links module.
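Before feeding a large file to MySQL, it can be worth checking that every link line actually produced an INSERT statement. The helper below is a hypothetical sketch (check_counts is not part of PHP-Nuke or of the script in this section); it simply compares two grep counts.

```shell
# check_counts HTML_FILE SQL_FILE
# Succeeds only if the number of link lines in the HTML file equals the
# number of INSERT lines in the SQL file.
check_counts() {
  # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
  links=$(grep -c '<li><a href="' "$1" || true)
  inserts=$(grep -c '^INSERT INTO ' "$2" || true)
  echo "link lines: $links, INSERT statements: $inserts"
  [ "$links" -eq "$inserts" ]
}

# Possible usage with the files from this section ("your_database" is a placeholder):
#   check_counts dmoz.html dmoz.sql && mysql your_database < dmoz.sql
```

If the counts differ, some link lines did not match the pattern in the sed script (for example because the link and its description are spread over several lines), and the script needs adjusting before the import.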

Tip A big time saver!
 

If you have thousands (or even hundreds) of web links to enter, this method will save you days of typing! That's the power of sed!

Since I used DMOZ (or ODP, Open Directory Project) as an example, let me add that there are a lot of scripts out there that can help you include the DMOZ data in your website. An example is phpODP. phpODP enables you to add the content of The Open Directory Project to your own website. You can let your visitors browse the categories, or search the directory - 100% locally on your site. The content is updated in real-time - so when ODP updates, you update!

Tip Search for ODP tools in ODP itself!
 

There is even an ODP category devoted to scripts and tools that can scrape ODP data for you: Computers: Internet: Searching: Directories: Open Directory Project: Use of ODP Data: Upload Tools. It includes links to more than 20 tools that can do this, so one of them should fit your purposes if the method described here is not the right one for you. In any case, it is a rich source of ideas that you can explore!

Notes

[1]

If you are looking for a ready-made ODP module for PHP-Nuke, see Section 8.3.13.


Help us make a better PHP-Nuke HOWTO!

Want to contribute to this HOWTO? Have a suggestion or a solution to a problem that was not treated here? Post your comments on my PHP-Nuke Forum!

Chris Karakas, Maintainer PHP-Nuke HOWTO

 
