Page 1 of 1

[solved] mysql: copy value from column to column and perform string operation

Posted: Sat Jun 13, 2009 2:11 pm
by nhaack
Hi there,

I am currently working on a kind of CMSMS experiment. For this I have imported some tables of data (own data structure, containing third party content). For simplification the table layout is:

ID: Integer (e.g. "12345")
Name: varchar(80) (e.g. "Some Text Entry")
Summary: text (e.g. "The summary of the topc")

To acccess the data, I want to use pretty URLs like: /page-alias/some-text-entry/
So I plan to use a rewrite rule like this:

RewriteCond %{REQUEST_URI} ^/page-alias/
RewriteRule ^([^/]*)/([^/]*) cmsms/index.php?page=page-alias&item-alias=$2 [QSA,L]

With smarty I capture the get value, pass it on to a plug-in or UDT and check the DB for the corresponding element. Then I can output the other item data.

The URL stuff works without problem. DB import was no problem and using the url to query the data also works (I manually edited some fields.

To problem comes, when I want to find the entry in the DB. Using the regular Name-field doesn't really work. "Some Text Entry" and "some-text-entry" do not match for obvious reason.

I can not just remove the "-" in the item-alias and uppercase the single words first letter to match, as the Name from the DB  may contain special characters and such.

I can generate a nice alias from the original name by replacing specials characters and spaces and such with hyphens or ascii equivalents and lowercase the data. it basically does the same as CMSMS when auto creating a page alias.

So I got the single bits working nice but stuck at putting them all together. Probably this is more a mysql question than directly CMSMS related, but probably someone of you knows a quick way :)

I see these options:

A) Either I transform the Name with mysql-query and then compare it to the item-alias recieved, but then I need to transform a lot of entries and thus might generate a lot of CPU load for each query. Additionally, I can not index the item-aliases in the DB as they are generated on request.

B) I could create a new column named alias. Then I run a mysql query, that reads column name, does string operation and saves the result in the column alias. This would only have to be done once or each time the DB get's updated and is pure mysql.

C) I create the extra column and then run a PHP script that goes column by column and copy saves the text. This would also only be an operation that is done for import and update. However, the DB tables can be very large with several million entries. I think writing the script is not the problem.

The stuff is running on a dedicated machine. Max execution times or memory are no problems. eAccelerator and mysql caching are available (it's just amazing how fast CMSMS can run in such an environment even with complex templates).

I personally would pick Option C, as I think that's something I know how to do. I just wonder whether this is the way to go in terms of good programming and efficiency?  

So can anybody give me a push into the right direction? Any other ideas how this could be solved?

Thanks and best
Nils

Re: mysql question: copy value from column to column and perform string operatio

Posted: Sun Jun 14, 2009 6:11 pm
by viebig
I would use php... or a sandalone script to convert all the titles to aliases, or even create something like:

add and extra field page_alias

on the page call.. something like

select * from table where alias=GET['alias']
if not found
select * from db where titleToAlias(title)=$page_alias
if not found 404
else
update table set alias  = titleToAlias(title) where title = title
assign variables and show the page

This would create an alias for each call, updating the db.. the next call to the same alias would not fall on the second query and the update....

A robot can easily trigger all that urls.. if they´re appear somewhere, and depending on your traffic you´ll nhave your db converted gradually and constantly.

This is flawless...a C code would take some time to develop, since the alias generations functions would have to be rewritten, with all the encoding, replacements.. etc... and if a comma is wrong, you can have a wrong alias, comparing to the php function that does that.

I did this for a site with millions of pages(in fact billions). Nor compromised the server load.... but slow down the page creation since I was getting the "db" from a external xml file(0.3), and processing more than 30-40(1-5 sec) per page load. Note that server was on bluehost(very slow shared server)... The solution was stable, and on every hit the db is updated... flawless.

EDIT: I did a youtube clone using gdata api... got banned from google, so just experiment... if you dont have unique content there´s a higgh possibility that your site gets banner after a while.

One solution is assigning noindex meta robots, on pages that dont have some custom and unique content.. like comments, images, ratings, reviews... google didint like my "infinite" site. Was a experiment. check it at qual-melhor.com.br...  Yahoo didn´t mind and is indexing http://siteexplorer.search.yahoo.com/se ... URL&fr=sfp

Regards

G

Re: mysql question: copy value from column to column and perform string operation

Posted: Sun Jun 14, 2009 7:20 pm
by nhaack
Hi Viebig,

thanks for the reply and the thoughts. Actually a nice idea to do it gradually.

However, tonight, I couldn't wait and generated all needed alias for the biggest table as a test. I created a new column and had a little php script that went through the DB row by row and updated the alias field with the generated alias. These were roughly 10m entries. I startet the batch run right before bed time and this morning it was done. The site is not yet publicly available, so i took the chance to let the server sweat :). The method is not really fast though. A pity, i forgot to look at the final generation time, but I would say it made~25k alias per minute (via UDT running within CMSMS).

Interesting that you mention" infinite site". It somehow plays into that direction. The content is quite a mixture coming from various sources. There is no real community function of now. Though I plan something like that in the future.

At an earlier stage, the site was mostly dependent an APIs making it very slow on initial call with nothing cached. I also played with the idea of robots triggering. So that the cache would already contain basic data and would only update this and that on user request, but still, performance was not satisfying.

So, I got this one solved at the very end.

Thank you and best regards
Nils

Re: [solved] mysql: copy value from column to column and perform string operatio

Posted: Sun Jun 14, 2009 7:38 pm
by viebig
Great!

Not that the generation time on qual-melhor is primarrily affect by the connections between bluehost and youtube... nothing is cached...

Regards

g

Re: [solved] mysql: copy value from column to column and perform string operation

Posted: Sun Jun 14, 2009 7:42 pm
by nhaack
Just out of couriosity, how many api calls do you make per page request on qual-melhor, do you aggregate several API? Did you file your IP for whitelisting somewhere at youtube? Did you experience any quota limits with them?

Best
Nils