Finding out a Page's Smarty/CMS dependencies

sloop · Post by **sloop** » Fri Apr 07, 2006 1:02 am

A question for you advanced developers: we all know a given Page content is based on a template, and the template and content/block fields may reference HtmlBlobs, tags, user tags, and modules.

How do you find out what those are?

I want to know because, while the CMS database knows what template a Page content uses, it doesn't know the other dependencies, and for an admin enhancement I'm coding it would be very worthwhile to have that info, without having to parse it out myself.

Any tips or ideas greatly appreciated!

-sloop

Ted · Post by **Ted** » Fri Apr 07, 2006 12:23 pm

I'm actually seeing this limitation as well now. In order to do Search the right way, I need to have this ability. There is currently no real way to do this without parsing everything by hand.

At some point down the road, I'll put this in, since it's becoming kind of necessary to me as well.

sloop · Post by **sloop** » Fri Apr 07, 2006 1:06 pm

Okay - y'know, I took a spin through the bundled Smarty code, hoping to find something like a static method that'd return useful info (Smarty::_fetch_resource_info sounded promising), but haven't found it yet. I'll take another look today, and think it could go either way: maybe it builds a tree of nested objects as it processes an object (::fetch, ::display), or maybe it just gathers what it needs and doesn't keep the objects it encounters in memory. If I find anything worthwhile I'll post back.

Ted · Post by **Ted** » Fri Apr 07, 2006 3:00 pm

Great, thanks!

petert · Post by **petert** » Thu Apr 20, 2006 11:32 am

Ted wrote: I'm actually seeing this limitation as well now. In order to do Search the right way, I need to have this ability. There is currently no real way to do this without parsing everything by hand.

Please use a spider for the search, it's the only proper way to do a search on a website.

Ted · Post by **Ted** » Thu Apr 20, 2006 12:15 pm

There are spiders that already exist out there. No point in reinventing the wheel. The issue is that they take a lot of work to set up and aren't instantaneous with changes. The search module I'd be doing is going to use the content directly, and allow modules to register/unregister their contents.
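
A minimal sketch of that register/unregister idea, written in Python for brevity rather than the CMS's PHP. Everything here is an assumption for illustration: the `SearchIndex` class, its method names, and the `(module, item_id)` key are hypothetical and not part of any existing CMS or module API. It only shows how an index could stay current as modules add and remove their own content, with no spider run in between.

```python
import re
from collections import defaultdict


class SearchIndex:
    """Hypothetical in-memory word index that modules update directly."""

    def __init__(self):
        self._words = defaultdict(set)  # word -> set of (module, item_id)
        self._items = {}                # (module, item_id) -> set of words

    def register(self, module, item_id, text):
        """Index (or re-index) one piece of content owned by a module."""
        self.unregister(module, item_id)  # replace any earlier version
        key = (module, item_id)
        words = set(re.findall(r"[a-z0-9]+", text.lower()))
        self._items[key] = words
        for word in words:
            self._words[word].add(key)

    def unregister(self, module, item_id):
        """Remove one piece of content from the index, if present."""
        key = (module, item_id)
        for word in self._items.pop(key, ()):
            self._words[word].discard(key)
            if not self._words[word]:
                del self._words[word]

    def search(self, query):
        """Return sorted (module, item_id) keys containing every query word."""
        terms = re.findall(r"[a-z0-9]+", query.lower())
        if not terms:
            return []
        hits = set.intersection(*(self._words.get(t, set()) for t in terms))
        return sorted(hits)
```

A module would call `register` whenever it saves an item and `unregister` when it deletes one. Note that this indexes the raw stored text, so anything produced only at render time would not be searchable.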

petert · Post by **petert** » Thu Apr 20, 2006 12:31 pm

What about rendered content? There will be loads of problems when you just search on content from the database.
And if you fix it so that content is rendered as it is supposed to be, then you will have reinvented the wheel.

sloop · Post by **sloop** » Thu Apr 20, 2006 3:12 pm

I've been using something called RI-Search, it's a PHP-based search engine that will spider both your site and files in the filesystem. Written by a bad-ass Russian engineer, quite efficient and quick.

http://risearch.org/eng/risearch/index.html

While it's not integrated directly into CMS on my sites, it's a great candidate for transmogrification into a plugin or module, which I'll do as soon as I get to it (i.e. in a couple of weeks if nobody beats me to it.)

It shouldn't be too difficult to have it index content obtained from CMS directly, avoiding HTTP overhead, though it'd have to be refarkled to keep it from trying to follow links.

petert · Post by **petert** » Thu Apr 20, 2006 3:27 pm

sloop wrote: I've been using something called RI-Search, it's a PHP-based search engine that will spider both your site and files in the filesystem. Written by a bad-ass Russian engineer, quite efficient and quick.

It says on the site that it is in Perl.

I use phpdig, it's in PHP and very fast. http://www.phpdig.net
I run the spider after I am done editing pages or adding news items; it works great and fast.

sloop · Post by **sloop** » Thu Apr 20, 2006 3:38 pm

Oops, wrong link...

http://risearch.org/eng/risearch_php/index.html

Battle of the search engines! One of the things I like about RISearch PHP is that it writes out its search index as relatively compact files. No database overhead when searching, and it can be used to index large amounts of text, hundreds of megs if you choose, and still be quite fast. One thing I don't like is it doesn't perform incremental indexing - it has to respider the entire site or text base in order to regenerate the index. However, one of the "Pro" versions, I believe, provides incremental updates.

It's all good... use what works best for you, and ultimately getting an engine integrated with CMS would be great, particularly for search in the Admin panel - that'd be fantastic.

Update on dependencies: I've written a parser that will take apart both Smarty and HTML tags. I'm currently using it to discover a page's inline and CSS assets (images, PDFs, CSS links, etc.), but it could also extract tag names to generate a list of tags, blobs, user-defined tags, and modules, so you'd have the complete set of assets a piece of content requires.
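
A rough sketch of the tag-name extraction step, in Python rather than PHP, and not the actual parser described above. It assumes blobs appear as `{html_blob name='...'}` and modules as `{cms_module module='...'}`; the function name `find_dependencies` and the `BUILTINS` list are made up for illustration. It uses regular expressions, so it will miss nested or unusual syntax that a real parser would handle.

```python
import re

# {name ...} or {name|modifier}; skips {$var}, {/close} and CSS like a{color:red}
TAG_RE = re.compile(r"\{([A-Za-z_][A-Za-z0-9_]*)((?:[\s|][^{}]*)?)\}")
# key="value", key='value' or key=bare
ATTR_RE = re.compile(
    r"""([A-Za-z_][A-Za-z0-9_]*)\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s"'}]+))"""
)
# Smarty control structures that are not dependencies (partial list, assumed)
BUILTINS = {
    "if", "else", "elseif", "foreach", "foreachelse", "section",
    "sectionelse", "literal", "strip", "capture", "assign",
    "ldelim", "rdelim",
}


def find_dependencies(template_text):
    """Return blob names, module names and other tag names used in the text."""
    # Drop Smarty comments and literal blocks before scanning for tags.
    text = re.sub(r"\{\*.*?\*\}", "", template_text, flags=re.S)
    text = re.sub(r"\{literal\}.*?\{/literal\}", "", text, flags=re.S)

    deps = {"blobs": set(), "modules": set(), "tags": set()}
    for match in TAG_RE.finditer(text):
        name, attr_text = match.group(1), match.group(2)
        if name in BUILTINS:
            continue
        attrs = {k: (dq or sq or bare)
                 for k, dq, sq, bare in ATTR_RE.findall(attr_text)}
        if name == "html_blob" and "name" in attrs:
            deps["blobs"].add(attrs["name"])
        elif name == "cms_module" and "module" in attrs:
            deps["modules"].add(attrs["module"])
        else:
            deps["tags"].add(name)
    return {kind: sorted(names) for kind, names in deps.items()}
```

The text alone can't tell a user-defined tag from a plugin tag, so both end up under `tags`; separating them would mean checking each name against whatever the CMS has stored for user tags. Running this over a page's template and each of its content blocks, then merging the results, would give the full dependency list for that page.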

This is all in support of a version-control module I've been tinkering with.

Elijah Lofgren · Post by **Elijah Lofgren** » Tue May 09, 2006 3:55 pm

sloop wrote: This is all in support of a version-control module I've been tinkering with.

You mean so we could roll-back changes to pages? Sounds very useful.

CMS Made Simple Forums
