#751
16.5 million posts
488k threads
vB 3.6, with no plans to move to vB 4 until everyone else does, too. Boolean and phrase search are needed; I've been missing them. Spending on a search solution is no problem. Spending 2k: no way.
__________________
eXBii.com - Indian community
no XB no fun know XB know fun !
#752
Okay, I've been working slowly but surely... Here are the constraints thus far:
1. New threads/posts are added when you run your delta cron job (most run every 2-5 min).
2. Changes in # of views, last poster, deleted threads/posts, etc. should be real-time updates.
3. Edits to the title or post text will not be updated until the next full re-index (usually nightly) unless the post is within the delta file.
Will have boolean searching, phrase, etc...
__________________
My Site: EXTREME Overclocking Do not PM me with your iTrader problems or asking for the code. I will just delete your PM without reading it.
#753
Originally Posted by eoc_Jason
One thing I'd like to have that we don't currently have, is properly ordered search results. If you don't do full reindexing on a regular basis, they tend to get really out of order.
#754
Apart from using Sphinx to search for the similar threads, you can also use it to generate the post excerpts with search keywords highlighted when in the "Show search results as posts" mode.
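As a rough illustration of what a highlighted excerpt looks like, here is a small standalone sketch. It is not Sphinx's own excerpt builder and does not call Sphinx at all; the function name `build_excerpt` and its parameters are made up for this example.

```python
def build_excerpt(text, keywords, around=3, before="<b>", after="</b>"):
    """Return a short window of words around the first keyword hit, with hits wrapped."""
    wanted = {k.lower() for k in keywords}
    words = text.split()
    # A word counts as a hit if it matches a keyword once surrounding punctuation is stripped.
    hits = [i for i, w in enumerate(words)
            if w.strip(".,!?;:\"'()").lower() in wanted]
    if hits:
        start = max(0, hits[0] - around)
        end = min(len(words), hits[0] + around + 1)
    else:
        # No keyword found: fall back to the beginning of the post.
        start, end = 0, min(len(words), 2 * around + 1)
    out = [before + w + after if i in hits else w
           for i, w in enumerate(words[start:end], start)]
    prefix = "... " if start > 0 else ""
    suffix = " ..." if end < len(words) else ""
    return prefix + " ".join(out) + suffix

excerpt = build_excerpt(
    "We moved the forum search to Sphinx last year and never looked back",
    ["sphinx"],
    around=2,
)
```

A real setup would more likely hand the post text and the query to the search daemon's excerpt call and let it choose the passages.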
Our stats: almost 14 mln posts, 1.1 mln threads, 300k users, vB 3.8. We're using our own Sphinx implementation since it predates the hack in this thread.
We got rid of the obscure search and sort modes though (such as sorting by the number of views or replies), and there was not a single complaint from our members. I don't think you should focus too much on 100% compliance with the default search. Having too many document attributes will inflate the index size, resulting in more I/O and more sluggish performance.
If you are worried about the need to edit the default search form template, you could always clone it, make the necessary changes and ship it with the product.
#755
Thanks for the feedback guys. Another thing I'm pondering: instead of trying to work off just a main + delta index, break the total post count up and constantly rotate smaller indexes...
I.e., if a site has 10,000,000 posts, have 10 indexes, each with 1,000,000 posts. Then have each of the indexes rotate, say, hourly. This would be a shift from the typical one massive re-index nightly (or however often you do it). In theory, too, the last index would contain the most recent posts and could be re-indexed more often.
I dunno, that's just a thought... My concern right now is the core code for searching; the indexes themselves can be manipulated differently at a later time, as that is transparent to everything else.
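A quick sketch of the idea, assuming post ids are roughly contiguous so that equal id ranges hold roughly equal numbers of posts. The function names and the round-robin schedule are illustrative assumptions only, not something that exists in the hack.

```python
def partition_ranges(max_id, parts):
    """Split ids 1..max_id into at most `parts` contiguous (low, high) ranges."""
    size = -(-max_id // parts)  # ceiling division
    return [(lo, min(lo + size - 1, max_id))
            for lo in range(1, max_id + 1, size)]

def shards_to_rebuild(tick, parts):
    """On each tick (e.g. hourly) rebuild one older shard in turn, plus the newest shard."""
    newest = parts - 1
    older = tick % (parts - 1) if parts > 1 else 0
    return sorted({older, newest})

ranges = partition_ranges(10_000_000, 10)
first_hours = [shards_to_rebuild(t, 10) for t in range(3)]
```

With 10 shards, every older shard gets rebuilt once every 9 ticks, while the shard holding the most recent posts is rebuilt on every tick.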
__________________
My Site: EXTREME Overclocking Do not PM me with your iTrader problems or asking for the code. I will just delete your PM without reading it.
#756
Originally Posted by eoc_Jason
That's what we're doing, too, though the delta is still there. The bonus is that you can set up a distributed index with the number of agents equal to the number of CPUs, like described here, to take advantage of all CPUs in the server. However, it's more of a manual operation; it would be hard to generate a partitioned sphinx.conf automatically.
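For readers who have not seen a partitioned setup, here is a rough sketch of what the relevant part of such a config could look like, rendered from a hand-picked list of id ranges. Everything specific in it is an assumption: the parent source `post_base` (expected to hold the connection settings elsewhere in the file), the `post` table and its columns, the index names, the data path, and the searchd port. It shows the shape only and leaves out attributes, the delta, and everything else a real vBulletin config needs.

```python
def render_partitioned_conf(ranges, host="localhost", port=9312, path="/var/data/sphinx"):
    """Render source/index blocks per id range plus one distributed index over them."""
    if not ranges:
        raise ValueError("need at least one id range")
    names, blocks = [], []
    for n, (lo, hi) in enumerate(ranges):
        name = f"post_part{n}"
        names.append(name)
        blocks.append(
            f"source {name} : post_base\n"
            "{\n"
            "    sql_query = SELECT postid, title, pagetext FROM post "
            f"WHERE postid BETWEEN {lo} AND {hi}\n"
            "}\n"
            f"index {name}\n"
            "{\n"
            f"    source = {name}\n"
            f"    path = {path}/{name}\n"
            "}\n"
        )
    # One partition is searched locally; the others are listed as agents on the
    # same searchd so that they can be searched in parallel.
    members = [f"    local = {names[0]}"]
    members += [f"    agent = {host}:{port}:{name}" for name in names[1:]]
    blocks.append(
        "index post_dist\n{\n    type = distributed\n" + "\n".join(members) + "\n}\n"
    )
    return "\n".join(blocks)

conf = render_partitioned_conf([(1, 100), (101, 200)])
```

Rendering the blocks is the easy part; choosing sensible ranges and keeping them in step with the delta is where the manual work comes in.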
#757
kmike - thanks for that info, I must have overlooked that in the docs...
Just curious, how much of a performance difference did you see using the distributed process?
I kind of got sidetracked today... One of my good friends' wife just got out of the hospital, so I was there for a while today. Then I was coding some anti-spammer measures for my forum registration process...
__________________
My Site: EXTREME Overclocking Do not PM me with your iTrader problems or asking for the code. I will just delete your PM without reading it.
#758
We have 2 post indexes, one for our live post table, and one for our archived post table. They have 30 million posts each. I don't see a point in sharding the post indexes aside from being able to take advantage of multiple CPUs when indexing.
The way I see it, if I can keep the old indexes online while I do a full reindex, I don't really care how long the full reindex takes since (at least in our case), the search server is just a slave database server and not our primary.
#759
The only thing I am waiting on before converting to vB4 is sphinx (or a working search alternative). The rest of the little stuff I modded I can do with or without until those developers get upgrades.
1.3 million threads 18 million posts
__________________
KEVLAR www.bimmerforums.com
#760
mute, can you share how you archived the post table? What changes did you make in the code and in MySQL? I want to move my old posts to a separate post_archive table, but I am not sure how to join those tables from the vBulletin code.
eoc_Jason, my forum has 200k threads and 10 million posts on vB 3.8.4. I have only one database (no slave), an nginx web server, and a Core i7 with 12GB RAM. I installed Sphinx on the server, and from SSH it works great, but from the modded search.php it behaves very strangely. Sometimes, when I search for keywords with the "show results as posts" option, it returns a "no results" message, but if I change the option to "show results as threads" with the same keywords, I get a good number of results shown as threads. Searching a user's posts does not work at all: search.php?do=finduser&u=xxx always gives a blank screen, with no PHP errors in the log or anywhere else.
#761
__________________
eXBii.com - Indian community
no XB no fun know XB know fun !
#762
I think splitting the big post table into smaller read-only archive tables would be a cheaper and even better solution, and of course Sphinx for search.
#763
Originally Posted by kris
It's REALLY nasty. I really don't think you want to do it. In fact, I'm thinking about abandoning it on our site.
Back when we wrote it, we were probably at like 25 million posts, on (if I remember right) a dual Xeon with HT. Now, we're on a Quad Quad Xeon box with 16GB of RAM. We have 30 million in our archived tables (10 tables x 3 million posts each) + 30 million in our post table. I'm not seeing any slowdowns against the post table, which has me wondering if we'd be seeing any slowdowns if I was pulling against all 60 million in one table, given how much faster our CPUs have gotten and how much RAM we have sitting around.
#764
Hi,
I have had Sphinx installed for my wiki for a few days, and am now looking to get it working with my vBulletin setup. But my server load went a bit overboard this morning (120+) and I have no idea how to deal with that. Any tips on taking some pressure off the load before I add this to vBulletin?
Some server stats (dedicated):
Processor #1 Vendor: GenuineIntel
Processor #1 Name: Intel(R) Core(TM)2 Duo CPU E8300 @ 2.83GHz
Processor #1 speed: 1998.000 MHz
Processor #1 cache size: 6144 KB
Processor #2 Vendor: GenuineIntel
Processor #2 Name: Intel(R) Core(TM)2 Duo CPU E8300 @ 2.83GHz
Processor #2 speed: 1998.000 MHz
Processor #2 cache size: 6144 KB
#765
go through this thread over at vb.com
__________________
eXBii.com - Indian community
no XB no fun know XB know fun !