Rietveld Code Review Tool
Help | Bug tracker | Discussion group | Source code | Sign in
(161)

Issue 4539114: Whoosh indexing work

Can't Edit
Can't Publish+Mail
Start Review
Created:
13 years, 5 months ago by Michael Mayorov
Modified:
13 years, 4 months ago
Visibility:
Public.

Description

Search and indexation for MoinMoin 2

Patch Set 1 #

Total comments: 13

Patch Set 2 : Creating tmp files with tempfile.mkdtemp, fixing some name issues #

Total comments: 1

Patch Set 3 : Changes to indexing schema test #

Total comments: 13

Patch Set 4 : Fixing Thomas Waldmann's and Reimar Bauer's remarks #

Patch Set 5 : add add_document,remove_document, update_document methods #

Total comments: 15

Patch Set 6 : Fixed some stupid mistakes, add (may be) correct names. test_indexing has no fails anymore. #

Total comments: 1

Patch Set 7 : changed testing alghorithm in test_indexing.py #

Total comments: 10

Patch Set 8 : small fixes for test_indexing.py #

Total comments: 2

Patch Set 9 : Rebuilding indexes script #

Total comments: 12

Patch Set 10 : Make some changes in algorithm and value names #

Total comments: 5

Patch Set 11 : Fixes ugly code and some value names #

Patch Set 12 : Using MultiSegmentWriter instead of standard writer #

Total comments: 8

Patch Set 13 : Now we can build partial indexes #

Total comments: 12

Patch Set 14 : Some fixes in building indexes. #

Total comments: 7

Patch Set 15 : minor fixes in building indexes, now working with python2.6 #

Total comments: 1

Patch Set 16 : Add clean option #

Total comments: 7

Patch Set 17 : Fixes in cleaning #

Total comments: 5

Patch Set 18 : WhooshIndex.clean_index() for clearing indexes, changes in rebuild_indexes script #

Total comments: 2

Patch Set 19 : Fixing WhooshIndex class, add test for building indexes #

Total comments: 6

Patch Set 20 : Rename build_index() to create_index(), remove creating-test, some fixes in WhooshIndex #

Total comments: 1

Patch Set 21 : Now if create=False and indexes can not be open, raise FatalError #

Patch Set 22 : Add script for updating indexes #

Total comments: 8

Patch Set 23 : Change algorithm in update_indexes.py #

Total comments: 5

Patch Set 24 : New algo for updating indexes,but for now isnt working,traceback http://paste.pocoo.org/show/416926/ #

Total comments: 7

Patch Set 25 : Fixes in update_script #

Total comments: 1

Patch Set 26 : Fix mistakes in update_indexes #

Patch Set 27 : Update indexes script #

Patch Set 28 : Update indexes, changes in algo #

Total comments: 16

Patch Set 29 : Fix in update_indexes, now working for latest_revisions. #

Patch Set 30 : now using diff for storage and index revisions,handling exceptions for thrashed items #

Total comments: 7

Patch Set 31 : Fixes from previous review #

Total comments: 8

Patch Set 32 : Fixed redundancy #

Total comments: 2

Patch Set 33 : minor changes #

Unified diffs Side-by-side diffs Delta from patch set Stats (+67 lines, -111 lines) Patch
M MoinMoin/script/maint/update_indexes.py View 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 2 chunks +67 lines, -111 lines 0 comments Download

Messages

Total messages: 33
ThomasWaldmann
some comments http://codereview.appspot.com/4539114/diff/1/MoinMoin/search/_tests/test_acl.py File MoinMoin/search/_tests/test_acl.py (right): http://codereview.appspot.com/4539114/diff/1/MoinMoin/search/_tests/test_acl.py#newcode17 MoinMoin/search/_tests/test_acl.py:17: assert False OK, that is not a ...
13 years, 5 months ago (2011-06-06 16:48:10 UTC) #1
ThomasWaldmann
clarify http://codereview.appspot.com/4539114/diff/3002/MoinMoin/search/_tests/test_indexing.py File MoinMoin/search/_tests/test_indexing.py (right): http://codereview.appspot.com/4539114/diff/3002/MoinMoin/search/_tests/test_indexing.py#newcode23 MoinMoin/search/_tests/test_indexing.py:23: what I meant with my previous comment about ...
13 years, 5 months ago (2011-06-06 17:44:31 UTC) #2
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/8001/MoinMoin/search/_tests/test_indexing.py File MoinMoin/search/_tests/test_indexing.py (right): http://codereview.appspot.com/4539114/diff/8001/MoinMoin/search/_tests/test_indexing.py#newcode26 MoinMoin/search/_tests/test_indexing.py:26: schema = getattr(index_object, self.index_name) that is NOT a schema, ...
13 years, 5 months ago (2011-06-08 09:04:42 UTC) #3
Reimar Bauer
tempfile.mkdtemp should get a dir some other minor findings http://codereview.appspot.com/4539114/diff/8001/MoinMoin/search/_tests/test_indexing.py File MoinMoin/search/_tests/test_indexing.py (right): http://codereview.appspot.com/4539114/diff/8001/MoinMoin/search/_tests/test_indexing.py#newcode23 MoinMoin/search/_tests/test_indexing.py:23: ...
13 years, 5 months ago (2011-06-08 09:21:23 UTC) #4
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/3004/MoinMoin/search/_tests/test_indexing.py File MoinMoin/search/_tests/test_indexing.py (right): http://codereview.appspot.com/4539114/diff/3004/MoinMoin/search/_tests/test_indexing.py#newcode27 MoinMoin/search/_tests/test_indexing.py:27: much better now http://codereview.appspot.com/4539114/diff/3004/MoinMoin/search/_tests/test_indexing.py#newcode29 MoinMoin/search/_tests/test_indexing.py:29: index_object = WhooshIndex(index_dir=self.index_dir) that ...
13 years, 5 months ago (2011-06-08 19:34:31 UTC) #5
ThomasWaldmann
looks fine now, just one minor comment if test works, please commit / push. http://codereview.appspot.com/4539114/diff/18001/MoinMoin/search/analyzers.py ...
13 years, 5 months ago (2011-06-08 21:13:25 UTC) #6
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/9005/MoinMoin/search/_tests/test_indexing.py File MoinMoin/search/_tests/test_indexing.py (right): http://codereview.appspot.com/4539114/diff/9005/MoinMoin/search/_tests/test_indexing.py#newcode24 MoinMoin/search/_tests/test_indexing.py:24: "rev_no": 1, first revno is 0 http://codereview.appspot.com/4539114/diff/9005/MoinMoin/search/_tests/test_indexing.py#newcode25 MoinMoin/search/_tests/test_indexing.py:25: "datetime": ...
13 years, 5 months ago (2011-06-11 19:59:04 UTC) #7
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/18002/MoinMoin/search/_tests/test_indexing.py File MoinMoin/search/_tests/test_indexing.py (right): http://codereview.appspot.com/4539114/diff/18002/MoinMoin/search/_tests/test_indexing.py#newcode89 MoinMoin/search/_tests/test_indexing.py:89: (u"itemlinks", u"Home", 2, 1), how can latest be more ...
13 years, 5 months ago (2011-06-11 21:22:01 UTC) #8
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/27001/MoinMoin/script/maint/build_indexes.py File MoinMoin/script/maint/build_indexes.py (right): http://codereview.appspot.com/4539114/diff/27001/MoinMoin/script/maint/build_indexes.py#newcode5 MoinMoin/script/maint/build_indexes.py:5: MoinMoin - Rebuild indexes filename is "build indexes", here ...
13 years, 4 months ago (2011-06-14 19:58:18 UTC) #9
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/13003/MoinMoin/script/maint/build_indexes.py File MoinMoin/script/maint/build_indexes.py (right): http://codereview.appspot.com/4539114/diff/13003/MoinMoin/script/maint/build_indexes.py#newcode24 MoinMoin/script/maint/build_indexes.py:24: latest_revs_fields_name = latest_revs_index.schema.names() latest_rev_field_names http://codereview.appspot.com/4539114/diff/13003/MoinMoin/script/maint/build_indexes.py#newcode26 MoinMoin/script/maint/build_indexes.py:26: all_revs_fields_name = all_revs_index.schema.names() ...
13 years, 4 months ago (2011-06-14 20:50:16 UTC) #10
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/17002/MoinMoin/script/maint/build_indexes.py File MoinMoin/script/maint/build_indexes.py (right): http://codereview.appspot.com/4539114/diff/17002/MoinMoin/script/maint/build_indexes.py#newcode23 MoinMoin/script/maint/build_indexes.py:23: help='Number of processors. Default: 1'), I know it is ...
13 years, 4 months ago (2011-06-15 11:18:55 UTC) #11
Reimar Bauer
add a blank before bracket ( if scripts never become imported to something else initialisation ...
13 years, 4 months ago (2011-06-15 11:20:19 UTC) #12
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/33001/MoinMoin/script/maint/build_indexes.py File MoinMoin/script/maint/build_indexes.py (right): http://codereview.appspot.com/4539114/diff/33001/MoinMoin/script/maint/build_indexes.py#newcode22 MoinMoin/script/maint/build_indexes.py:22: help='Number of processors the writer. Default: 1'), Number of ...
13 years, 4 months ago (2011-06-16 11:37:37 UTC) #13
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/38001/MoinMoin/script/maint/build_indexes.py File MoinMoin/script/maint/build_indexes.py (right): http://codereview.appspot.com/4539114/diff/38001/MoinMoin/script/maint/build_indexes.py#newcode22 MoinMoin/script/maint/build_indexes.py:22: Option('--procs', '-p', required=False, dest='procs', type=int, default=None, Is None a ...
13 years, 4 months ago (2011-06-16 14:08:13 UTC) #14
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/42002/MoinMoin/script/maint/build_indexes.py File MoinMoin/script/maint/build_indexes.py (right): http://codereview.appspot.com/4539114/diff/42002/MoinMoin/script/maint/build_indexes.py#newcode80 MoinMoin/script/maint/build_indexes.py:80: if method == u"both": in line 27, you give ...
13 years, 4 months ago (2011-06-16 19:25:35 UTC) #15
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/39002/MoinMoin/script/maint/build_indexes.py File MoinMoin/script/maint/build_indexes.py (right): http://codereview.appspot.com/4539114/diff/39002/MoinMoin/script/maint/build_indexes.py#newcode29 MoinMoin/script/maint/build_indexes.py:29: or just nothing to lose'), you are duplicating stuff, ...
13 years, 4 months ago (2011-06-17 09:08:00 UTC) #16
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/41003/MoinMoin/script/maint/build_indexes.py File MoinMoin/script/maint/build_indexes.py (right): http://codereview.appspot.com/4539114/diff/41003/MoinMoin/script/maint/build_indexes.py#newcode27 MoinMoin/script/maint/build_indexes.py:27: Option('--clean', required=False, dest='clean', type=bool, choices=(True,), are you sure you ...
13 years, 4 months ago (2011-06-17 13:24:48 UTC) #17
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/44002/MoinMoin/search/indexing.py File MoinMoin/search/indexing.py (right): http://codereview.appspot.com/4539114/diff/44002/MoinMoin/search/indexing.py#newcode100 MoinMoin/search/indexing.py:100: if given_indexname in index_schema that works, but isn't the ...
13 years, 4 months ago (2011-06-18 13:33:50 UTC) #18
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/33005/MoinMoin/search/_tests/test_indexing.py File MoinMoin/search/_tests/test_indexing.py (right): http://codereview.appspot.com/4539114/diff/33005/MoinMoin/search/_tests/test_indexing.py#newcode138 MoinMoin/search/_tests/test_indexing.py:138: for field_name, query, latst_res_len, all_res_len in queries: is there ...
13 years, 4 months ago (2011-06-18 14:53:48 UTC) #19
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/52001/MoinMoin/search/indexing.py File MoinMoin/search/indexing.py (right): http://codereview.appspot.com/4539114/diff/52001/MoinMoin/search/indexing.py#newcode85 MoinMoin/search/indexing.py:85: raise FatalError("can't open whoosh index") wrong place if create ...
13 years, 4 months ago (2011-06-18 15:48:41 UTC) #20
Reimar Bauer
str(key) ? and a unspecific != http://codereview.appspot.com/4539114/diff/54001/MoinMoin/script/maint/update_indexes.py File MoinMoin/script/maint/update_indexes.py (right): http://codereview.appspot.com/4539114/diff/54001/MoinMoin/script/maint/update_indexes.py#newcode42 MoinMoin/script/maint/update_indexes.py:42: metadata = dict([(str(key), ...
13 years, 4 months ago (2011-06-20 15:36:43 UTC) #21
Michael Mayorov
Comments http://codereview.appspot.com/4539114/diff/54001/MoinMoin/script/maint/update_indexes.py File MoinMoin/script/maint/update_indexes.py (right): http://codereview.appspot.com/4539114/diff/54001/MoinMoin/script/maint/update_indexes.py#newcode42 MoinMoin/script/maint/update_indexes.py:42: metadata = dict([(str(key), value) On 2011/06/20 15:36:44, Reimar ...
13 years, 4 months ago (2011-06-20 15:40:15 UTC) #22
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/54001/MoinMoin/script/maint/update_indexes.py File MoinMoin/script/maint/update_indexes.py (right): http://codereview.appspot.com/4539114/diff/54001/MoinMoin/script/maint/update_indexes.py#newcode40 MoinMoin/script/maint/update_indexes.py:40: for rev_no in item.list_revisions(): why are you checking all ...
13 years, 4 months ago (2011-06-20 23:39:06 UTC) #23
Michael Mayorov
http://codereview.appspot.com/4539114/diff/54001/MoinMoin/script/maint/update_indexes.py File MoinMoin/script/maint/update_indexes.py (right): http://codereview.appspot.com/4539114/diff/54001/MoinMoin/script/maint/update_indexes.py#newcode40 MoinMoin/script/maint/update_indexes.py:40: for rev_no in item.list_revisions(): On 2011/06/20 23:39:06, ThomasWaldmann wrote: ...
13 years, 4 months ago (2011-06-21 04:18:09 UTC) #24
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/34002/MoinMoin/script/maint/update_indexes.py File MoinMoin/script/maint/update_indexes.py (right): http://codereview.appspot.com/4539114/diff/34002/MoinMoin/script/maint/update_indexes.py#newcode22 MoinMoin/script/maint/update_indexes.py:22: # yeild dict with fields of each revision by ...
13 years, 4 months ago (2011-06-21 22:42:08 UTC) #25
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/46003/MoinMoin/script/maint/update_indexes.py File MoinMoin/script/maint/update_indexes.py (right): http://codereview.appspot.com/4539114/diff/46003/MoinMoin/script/maint/update_indexes.py#newcode29 MoinMoin/script/maint/update_indexes.py:29: yield revision that doesn't read sane if this function ...
13 years, 4 months ago (2011-06-22 14:29:20 UTC) #26
Michael Mayorov
found bug http://codereview.appspot.com/4539114/diff/46003/MoinMoin/script/maint/update_indexes.py File MoinMoin/script/maint/update_indexes.py (right): http://codereview.appspot.com/4539114/diff/46003/MoinMoin/script/maint/update_indexes.py#newcode29 MoinMoin/script/maint/update_indexes.py:29: yield revision On 2011/06/22 14:29:20, ThomasWaldmann wrote: ...
13 years, 4 months ago (2011-06-22 18:21:30 UTC) #27
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/55003/MoinMoin/script/maint/update_indexes.py File MoinMoin/script/maint/update_indexes.py (right): http://codereview.appspot.com/4539114/diff/55003/MoinMoin/script/maint/update_indexes.py#newcode35 MoinMoin/script/maint/update_indexes.py:35: def yield_backend_rev(item): maybe don't but "yield" in the function ...
13 years, 4 months ago (2011-06-24 11:02:52 UTC) #28
Michael Mayorov
http://codereview.appspot.com/4539114/diff/55003/MoinMoin/script/maint/update_indexes.py File MoinMoin/script/maint/update_indexes.py (right): http://codereview.appspot.com/4539114/diff/55003/MoinMoin/script/maint/update_indexes.py#newcode45 MoinMoin/script/maint/update_indexes.py:45: metadata["rev_no"] = rev_no[0] On 2011/06/24 11:02:52, ThomasWaldmann wrote: > ...
13 years, 4 months ago (2011-06-24 11:57:58 UTC) #29
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/39007/MoinMoin/script/maint/update_indexes.py File MoinMoin/script/maint/update_indexes.py (right): http://codereview.appspot.com/4539114/diff/39007/MoinMoin/script/maint/update_indexes.py#newcode55 MoinMoin/script/maint/update_indexes.py:55: last_documents = [] maybe rather use "latest" instead of ...
13 years, 4 months ago (2011-06-26 08:42:15 UTC) #30
Michael Mayorov
http://codereview.appspot.com/4539114/diff/39007/MoinMoin/script/maint/update_indexes.py File MoinMoin/script/maint/update_indexes.py (right): http://codereview.appspot.com/4539114/diff/39007/MoinMoin/script/maint/update_indexes.py#newcode63 MoinMoin/script/maint/update_indexes.py:63: last_documents.append({name: list(add_rev_nos)[-1]}) # Add last revision On 2011/06/26 08:42:15, ...
13 years, 4 months ago (2011-06-26 09:13:01 UTC) #31
ThomasWaldmann
http://codereview.appspot.com/4539114/diff/51005/MoinMoin/script/maint/update_indexes.py File MoinMoin/script/maint/update_indexes.py (right): http://codereview.appspot.com/4539114/diff/51005/MoinMoin/script/maint/update_indexes.py#newcode24 MoinMoin/script/maint/update_indexes.py:24: # yield dict with fields of each revision by ...
13 years, 4 months ago (2011-06-26 09:44:15 UTC) #32
ThomasWaldmann
13 years, 4 months ago (2011-06-26 11:17:08 UTC) #33
http://codereview.appspot.com/4539114/diff/61001/MoinMoin/script/maint/update...
File MoinMoin/script/maint/update_indexes.py (right):

http://codereview.appspot.com/4539114/diff/61001/MoinMoin/script/maint/update...
MoinMoin/script/maint/update_indexes.py:28: if revs_found: # if doc with this
name doesn't exist
if it is [] if it did not find anything, you can leave away that "if ...".

http://codereview.appspot.com/4539114/diff/61001/MoinMoin/script/maint/update...
MoinMoin/script/maint/update_indexes.py:61: latest_documents.append((name,
list(add_rev_nos)[-1])) # Add latest revision
don't assume that sets have order, they do not.

it can happen that you are lucky and the [-1] is really the latest one, but this
is not the case in general, so assume sets have random order.
Sign in to reply to this message.

Powered by Google App Engine
RSS Feeds Recent Issues | This issue
This is Rietveld f62528b