

List:       solr-dev
Subject:    [jira] [Commented] (SOLR-10229) See what it would take to shift many of our one-off schemas used for
From:       "Erick Erickson (JIRA)" <jira () apache ! org>
Date:       2017-03-31 23:54:41
Message-ID: JIRA.13048390.1488750532000.181294.1491004481707 () Atlassian ! JIRA


    [ https://issues.apache.org/jira/browse/SOLR-10229?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15951825#comment-15951825 ]

Erick Erickson commented on SOLR-10229:
---------------------------------------

A very minor nit: I'd drop the 's' on "withAttributes", i.e. "withAttribute", since
each call sets only one attribute, even though we're chaining the calls together.
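
To make the naming point concrete, here is a minimal sketch of a chained, one-attribute-per-call builder. FieldTypeSketch and its methods are hypothetical stand-ins, not the class from the attached patch; only the method name "withAttribute" comes from the suggestion above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical builder for illustration only; not the class in SOLR-10229.patch.
final class FieldTypeSketch {
    private final String name;
    // LinkedHashMap keeps attributes in the order they were set.
    private final Map<String, String> attributes = new LinkedHashMap<>();

    FieldTypeSketch(String name) {
        this.name = name;
    }

    // Singular name: each call sets exactly one attribute and returns
    // this, so several calls can still be chained.
    FieldTypeSketch withAttribute(String key, String value) {
        attributes.put(key, value);
        return this;
    }

    String name() {
        return name;
    }

    Map<String, String> attributes() {
        return attributes;
    }

    public static void main(String[] args) {
        FieldTypeSketch type = new FieldTypeSketch("string_dv")
            .withAttribute("class", "solr.StrField")
            .withAttribute("docValues", "true");
        System.out.println(type.name() + " " + type.attributes());
    }
}
```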

I don't think you need to set solr.solr.home, and I'm pretty sure setting it to the
sub-directory will cause Bad Things To Happen. I _think_ something like this should
work if you create "mother_configset/conf/managed-schema", etc. in
solr/core/src/test-files/solr/configsets:
TEST_PATH().resolve("configsets").resolve("mother_configset").resolve("conf").resolve("managed-schema").


On second thought, I'd probably rather not make a new configset, as the whole point
here is _NOT_ to load this up per test. So just putting a single file alongside the
other files, something like
TEST_PATH().resolve("collection1").resolve("conf").resolve("mother-schema"),
might be preferable. I think I like putting it there rather than in a new
configset, but that's not a strong preference.
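
For comparison, the two candidate locations discussed above can be sketched with plain java.nio paths. This is only an illustration: testPath() is a stand-in for SolrTestCaseJ4's TEST_PATH(), and it is assumed here to point at solr/core/src/test-files/solr, as the paths above imply.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

final class SchemaLocationSketch {
    // Stand-in for TEST_PATH(); the relative directory is an assumption.
    static Path testPath() {
        return Paths.get("solr", "core", "src", "test-files", "solr");
    }

    // First idea: a dedicated configset holding the template schema.
    static Path configsetLocation() {
        return testPath()
            .resolve("configsets")
            .resolve("mother_configset")
            .resolve("conf")
            .resolve("managed-schema");
    }

    // Second idea: a single file next to the existing collection1 config.
    static Path collection1Location() {
        return testPath()
            .resolve("collection1")
            .resolve("conf")
            .resolve("mother-schema");
    }

    public static void main(String[] args) {
        System.out.println(configsetLocation());
        System.out.println(collection1Location());
    }
}
```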

We may want to name it something more descriptive, though, maybe template-schema
or similar. schema-to-use-for-creating-fields-on-the-fly is a little too long.

It looks like you're thinking of having test classes subclass this. Could it be
instantiated as a static member of SolrTestCaseJ4 somehow? I think that's less
confusing, and all current tests would immediately have access. The only thing I see
at a quick glance that really requires SolrTestCaseJ4 is h.getCore(), so that would
probably mean we need to pass the core in to the methods that need it.

Passing the core in (or something similar) is pretty certain to be something we
want to do anyway. Using h.getCore() doesn't accommodate having different cores
with different schemas in the same test.
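
A rough sketch of what "pass the core in" could look like. FakeCore is a hypothetical stand-in for a Solr core and addField is an invented helper; neither is a Solr API. The only point is that a helper taking the core as a parameter can serve two cores with different schemas in one test.

```java
import java.util.ArrayList;
import java.util.List;

final class ExplicitCoreSketch {
    // Hypothetical stand-in for a Solr core; it only tracks field names.
    static final class FakeCore {
        final String name;
        final List<String> fieldNames = new ArrayList<>();

        FakeCore(String name) {
            this.name = name;
        }
    }

    // The helper receives the core explicitly instead of reading one
    // shared, harness-wide core, so the caller chooses the target.
    static void addField(FakeCore core, String fieldName) {
        core.fieldNames.add(fieldName);
    }

    public static void main(String[] args) {
        FakeCore first = new FakeCore("core1");
        FakeCore second = new FakeCore("core2");
        addField(first, "title_s");
        addField(second, "price_f");
        System.out.println(first.name + " " + first.fieldNames);
        System.out.println(second.name + " " + second.fieldNames);
    }
}
```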

I doubt we should persist any changes. Tests should fail if we try, since there's
code in place to prevent changing any source files; unless we copied the managed
schema being loaded to a temp directory, persisting should fail. In this case,
whatever schema the test loaded should be considered a "source file".
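
If persisting were ever wanted, the temp-directory route mentioned above might look roughly like this. It is a sketch using only java.nio.file; copyToTempConf and the "schema-sketch" prefix are made-up names, and nothing here reflects how the Solr test framework actually guards source files.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

final class TempSchemaCopySketch {
    // Copies a schema file into <new temp dir>/conf/managed-schema and
    // returns the copy, so later writes never touch the source file.
    static Path copyToTempConf(Path sourceSchema) throws IOException {
        Path confDir = Files.createTempDirectory("schema-sketch").resolve("conf");
        Files.createDirectories(confDir);
        Path target = confDir.resolve("managed-schema");
        Files.copy(sourceSchema, target, StandardCopyOption.REPLACE_EXISTING);
        return target;
    }

    public static void main(String[] args) throws IOException {
        // A throwaway file plays the role of the checked-in template schema.
        Path source = Files.createTempFile("template-schema", ".xml");
        Files.write(source, "<schema name=\"sketch\"/>".getBytes(StandardCharsets.UTF_8));

        Path copy = copyToTempConf(source);
        // Only the copy is modified; the "source file" stays as it was.
        Files.write(copy, "\n<!-- edited -->".getBytes(StandardCharsets.UTF_8),
            StandardOpenOption.APPEND);

        System.out.println("source bytes: " + Files.size(source));
        System.out.println("copy bytes:   " + Files.size(copy));
    }
}
```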

I like where this is going, it'll be exciting to get it in place.


> See what it would take to shift many of our one-off schemas used for testing to
> managed schema and construct them as part of the tests
> --------------------------------------------------------------------------------------------------------------------------------------
>  
> Key: SOLR-10229
> URL: https://issues.apache.org/jira/browse/SOLR-10229
> Project: Solr
> Issue Type: Improvement
> Security Level: Public(Default Security Level. Issues are Public) 
> Reporter: Erick Erickson
> Assignee: Erick Erickson
> Priority: Minor
> Attachments: SOLR-10229.patch
> 
> 
> The test schema files are intimidating. There are about a zillion of them, and
> making a change in any of them risks breaking some _other_ test. That leaves people
> three choices:
> 1> add what they need to some existing schema. Which makes schemas bigger and
>    bigger and bigger.
> 2> create a new schema file, adding to the proliferation thereof.
> 3> Look through all the existing tests to see if they have something that works.
>
> The recent work on LUCENE-7705 is a case in point. We're adding a maxLen parameter
> to some tokenizers. Putting those parameters into any of the existing schemas,
> especially to test < 255 char tokens is virtually guaranteed to break other tests,
> so the only safe thing to do is make another schema file. Adding to the
> multiplication of files.
>
> As part of SOLR-5260 I tried creating the schema on the fly rather than creating a
> new static schema file and it's not hard. WDYT about making this into some better
> thought-out utility?
>
> At present, this is pretty fuzzy, I wanted to get some reactions before putting
> much effort into it. I expect that the utility methods would eventually get a bunch
> of canned types. It's reasonably straightforward for primitive types, if lengthy.
> But when you get into solr.TextField-based types it gets less straight-forward. We
> could manage to just move the "intimidation" from the plethora of schema files to a
> zillion fieldTypes in the utility to choose from... Also, forcing every test to
> define the fields up-front is arguably less convenient than just having _some_
> canned schemas we can use. And erroneous schemas to test failure modes are probably
> not very good fits for any such framework.
>
> [~steve_rowe] and [~hossman_lucene@fucit.org] in particular might have something
> to say.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org

