[prev in list] [next in list] [prev in thread] [next in thread] 

List:       cassandra-dev
Subject:    Re: [DISCUSS] Allow UPDATE on settings virtual table to change running configuration
From:       Maxim Muzafarov <mmuzaf () apache ! org>
Date:       2023-06-23 11:50:13
Message-ID: CADiQCWJCgHgRUw2ekeSDSU0kN4HPBC14xLtQFJxyPwZ8Na+BQQ () mail ! gmail ! com
[Download RAW message or body]

Hello everyone,


As there is a lack of feedback for an option to go on with and having
a discussion for pros and cons for each option I tend to agree with
the vision of this problem proposed by David :-) After a lot of
discussion on Slack, we came to the @ValidatedBy annotation which
points to a validation method of a property and this will address all
our concerns and issues with validation.

I'd like to raise the visibility of these changes and try to find one
more committer to look at them:
https://issues.apache.org/jira/browse/CASSANDRA-15254
https://github.com/apache/cassandra/pull/2334/files

I'd really appreciate any kind of review in advance.


Despite the number of changes +2,043 −302 and the fact that most of
these additions are related to the tests themselves, I would like to
highlight the crucial design points which are required to make the
SettingsTable virtual table updatable. Some of these have already been
discussed in this thread, and I would like to provide a brief outline
of these points to facilitate the PR review.

So, what are the problems that have been solved to make the
SettingsTable updatable?

1. Input validation.

Currently, the JMX, Yaml and DatabaseDescriptor#apply methods perform
the same validation of user input for the same property in their own
ways which fortunately results in a consistent configuration state,
but not always. The CASSANDRA-17734 is a good example of this.

The @ValidatedBy annotations, which point to a validation method have
been added to address this particular problem. So, no matter what API
is triggered the method will be called to validate input and will also
work even if the cassandra.yaml is loaded by the yaml engine in a
pre-parse state, such as we are now checking input properties for
deprecation and nullability.

There are two types of validation worth mentioning:
- stateless - properties do not depend on any other configuration;
- stateful - properties that require a fully-constructed Config
instance to be validated and those values depend on other properties;

For the sake of simplicity, the scope of this task will be limited to
dealing with stateless properties only, but stateful validations are
also supported in the initial PR using property change listeners.

2. Property mutability.

There is no way of distinguishing which parts of a property are
mutable and which are not. This meta-information must be available at
runtime and as we discussed earlier the @Mutable annotation is added
to handle this.

3. Listening for property changes.

Some of the internal components e.g. CommitLog, may perform some
operations and/or calculations just before or just after the property
change. As long as JMX is the only API used to update configuration
properties, there is no problem. To address this issue the observer
pattern has been used to maintain the same behaviour.

4. SettingsTable input/output format.

JMX, SettingsTable and Yaml accept values in different formats which
may not be compatible in some of the cases especially when
representing composite objects. The former uses toString() as an
output, and the latter uses a yaml human-readable format.

So, in order to see the same properties in the same format through
different APIs, the Yaml representation is reused to display the
values and to parse a user input in case of update/set operations.

Although the output format between APIs matches in the vast majority
of cases here is the list of configuration properties that do not
match:
- memtable.configurations
- sstable_formats
- seed_provider.parameters
- data_file_directories

The test illustrates the problem:
https://github.com/apache/cassandra/pull/2334/files#diff-e94bb80f12622412fff9d87b58733e0549ba3024a54714516adc8bc70709933bR319


This could be a regression as the output format is changed, but this
seems to be the correct change to go with. We can keep it as is, or
create SettingsTableV2, which seems to be unlikely.

The examples of the format change:
---------------------------------------------
Property 'seed_provider.parameters' expected:
 {seeds=127.0.0.1:7012}
Property 'seed_provider.parameters' actual:
 seeds: 127.0.0.1:7012
---------------------------------------------
Property 'data_file_directories' expected:
 [Ljava.lang.String;@436813f3
Property 'data_file_directories' actual:
 [build/test/cassandra/data]
---------------------------------------------

On Mon, 1 May 2023 at 15:11, Maxim Muzafarov <mmuzaf@apache.org> wrote:
> 
> Hello everyone,
> 
> 
> I want to continue this topic and share another properties validation
> option/solution that emerged from my investigation of Cassandra and
> Accord configuration that could be used to make the virtual table
> SettingTable updatable, as each update must move Config from one
> consistent state to another. The solution is based on a few
> assumptions: we don't frequently update the running configuration, and
> we want to base a solution on established Cassandra validation
> approaches to minimise the impact on the configuration system layer
> and thus reuse what we already have.
> 
> Cassandra provides a number of methods to check the running
> configuration right after it is loaded from the yaml file. Here are
> some of them: DatabaseDescriptor#applySSTableFormats,
> DatabaseDescriptor#applySimpleConfig,
> DatabaseDescriptor#applyPartitioner etc. We can reuse them to perform
> consistent configuration updates, as long as applying them to a new
> configuration can guarantee us the correctness of the configuration.
> They could also help us to set the correct default values, calculated
> at runtime, when `null` is passed as an input (or `-1` in the case of
> JMX is used). For example, the `concurrent_compactors` has the
> following formula to calculate its default: min(8, max(2,
> min(getAvailableProcessors(), conf.data_file_directories.length))).
> This is unlikely to be achieved, regardless of any external validation
> framework we might use.
> 
> You can go directly to the code using the link below and see what the
> solution looks like, but I think I also need to provide the solution
> design with an ideal end state and some alternatives that were
> considered.
> https://github.com/apache/cassandra/pull/2300/files
> 
> 
> = The solution design (reuse methods) =
> 
> == Configuration Sources ==
> 
> To be able to reuse the methods applySSTableFormats, applySimpleConfig
> etc, we need to modify them slightly to pass new configuration changes
> for verification. Passing a new instance of the Config class to these
> methods to verify a single property on change seems very expensive as
> it requires a deep copy of the current configuration instance, so a
> good choice here is to create an intermediate interface layer -
> ConfigurationSource. Currently, applyXXX methods use direct access to
> the fields of the Config class instance, so having an intermediate
> interface will allow us to substitute a particular configuration
> property at the time of verification and re-run all checks against the
> substituted source.
> 
> In fact, all of the configuration options that can be used to
> configure Cassandra, such as system variables, environment variables
> or configuration properties, could be shaded through this interface.
> In the end, as the various property sources are implemented using the
> same interface, and share the same property names in different
> sources, we will be able to do sensible configuration overrides the
> same way the Spring does. For instance, read a property from sources
> in the following order: system properties, environment variables, and
> configuration yaml file.
> 
> The ConfigurationSource looks like a flattened source layer:
> 
> interface ConfigurationSource {
> <T> T get(Class<T> clazz, String name);
> String getString(String name);
> Integer getInteger(String name);
> Boolean getBoolean(String name);
> }
> 
> The ConfigurationSource shadowed the following configuration options
> in Cassandra:
> - the Config class source;
> - the CassandraRelevantProperties system properties source;
> - the CassandraRelevantEnv environment variables source;
> - other sub-project configurations dynamically added to the classpath;
> 
> 
> == Configuration Query ==
> 
> I have been delving through valuable Cassandra components and process
> managers such as StorageService, CommitLog, SnapshotManager etc. and
> found a lot of configuration usages doing some boilerplate variable
> transformations as well as running some component's actions
> before/after changing a configuration property. So this led me to
> create a wrapper around the ConfigurationSource to help read values
> and listen to changes.
> 
> Here are some usage examples:
> 
> ConfigurationQuery.from(tableSource)
> .getValue(DataStorageSpec.IntMebibytesBound.class,
> ConfigFields.REPAIR_SESSION_SPACE)
> .map(DataStorageSpec.IntMebibytesBound::toBytes)
> .listen(BEFORE_CHANGE, (oldValue, newValue) -> {
> assertNotNull(oldValue);
> assertNotNull(newValue);
> assertEquals(testValue.toBytes(), newValue.intValue());
> });
> 
> ConfigurationQuery.from(configSource, JMX_EXCEPTION_HANDLER)
> .getValue(Integer.class, ConfigFields.CONCURRENT_COMPACTORS)
> .listenOptional(ChangeEventType.BEFORE_CHANGE,
> (oldValue, newValue) -> newValue.ifPresent(
> CompactionManager.instance::setConcurrentCompactors));
> 
> 
> = Alternatives =
> 
> The least I want to do is reinvent the wheel, so the first thing I did
> was look at the configuration frameworks that might help us with the
> problems we are discussing.
> 
> Whatever framework we consider, the following things need to be taken
> into account:
> - We have custom configuration datatypes such as DataStorageSpec,
> DataStorageSpec;
> - We have custom DurationSpec, so we either move them to Duration,
> preserving backwards compatibility for all supported APIs (yaml, JMX),
> or extend a considered framework with new types, we have to provide
> data type converters in the latter case;
> - An additional dependency, so the key component (configuration) of
> the project becomes dependent on an external library version;
> - We have to deal with configuration defaults calculated during
> initialisation to maintain backward compatibility;
> 
> The frameworks I have looked at:
> - commons-configuration
> https://github.com/apache/commons-configuration
> - lightbend config
> https://github.com/lightbend/config
> - Netflix archaius
> https://github.com/Netflix/archaius
> 
> 
> The Apache Commons configuration from this list sounds less risky to
> us as we already have dependencies like commons-codec, commons-cli
> etc. The approach of how configuration fields are used in the
> Cassandra project is closer to the way the commons-configuration
> library maintains them, so we can replace the ConfigurationSource
> layer from the design with AbstractConfiguration
> (commons-configuration), keeping the same properties validation design
> concept.
> 
> The Apache Commons configuration provides Duration configuration types
> that look similar to the DurationSpec in Cassandra. Support/having
> both types in the case of we're going this library for the same
> abstraction confuses those who will be dealing with the configuration
> API in the internal code, so some kind of migration is still required
> here as well as creating custom adapters to support backwards
> compatibility. This is a HUGE change that helps to create an API for
> internal configuration usage for both Cassandra and sub-projects (e.g.
> Accord), but still does not solve the problem of availability of
> custom configuration datatypes (DataStorageSpec, DataStorageSpec) for
> sub-projects.
> 
> As a result of trying to implement commons-configuration as an
> internal API, I have come to the conclusion that the number of changes
> and compromises that need to be agreed upon will be very large in this
> case. So unless I'm missing something, the proposed design is our
> best.
> 
> 
> Thoughts?
> 
> On Thu, 30 Mar 2023 at 01:42, Maxim Muzafarov <mmuzaf@apache.org> wrote:
> > 
> > Hello everyone,
> > 
> > 
> > It seems to me that we need another consensus to make the
> > SettingsTable virtual table updatable. There is an issue with
> > validating configuration properties that blocks our implementation
> > with the virtual table.
> > 
> > A short example of validating the values loaded from the YAML file:
> > - the DurationSpec.LongMillisecondsBound itself requires input quantity >= 0;
> > - the read_request_timeout Config field with type
> > DurationSpec.LongMillisecondsBound requires quantity >=
> > LOWEST_ACCEPTED_TIMEOUT (10ms);
> > 
> > When the read_request_timeout field is modified using JMX, only a
> > DurationSpec.LongMillisecondsBound type validation is performed and
> > there is no LOWEST_ACCEPTED_TIMEOUT validation. If we implement the
> > SettingsTable properties validation in the same way, we just add
> > another discrepancy.
> > 
> > 
> > If we go a little deeper, we are currently validating a configuration
> > property in the following parts of the code, which makes things even
> > worse:
> > - in a property type itself if it's not primitive, e.g.
> > DataStorageSpec#validateQuantity;
> > - rarely in nested configuration classes e.g.
> > AuditLogOptions#validateCategories;
> > - during the configuration load from yaml-file for null, and non-null,
> > see YamlConfigurationLoader.PropertiesChecker#check;
> > - during applying the configuration, e.g. DatabaseDescriptor#applySimpleConfig;
> > - in DatabaseDescriptor setter methods e.g.
> > DatabaseDescriptor#setDenylistMaxKeysTotal;
> > - inside custom classes e.g. SSLFactory#validateSslContext;
> > - rarely inside JMX methods itself, e.g. StorageService#setRepairRpcTimeout;
> > 
> > 
> > To use the same validation path for configuration properties that are
> > going to be changed through SettingsTable, we need to arrange a common
> > validation process for each property to rely on, so that the
> > validation path will be the same regardless of the public interface
> > used (YAML, JMX, or Virtual Table).
> > 
> > In general, I'd like to avoid building a complex validation framework
> > for Cassandra's configuration, as the purpose of the project is not to
> > maintain the configuration itself, so the simpler the validation of
> > the properties will be, the easier the configuration will be to
> > maintain.
> > 
> > 
> > We might have the following options for building the validation
> > process, and each of them has its pros and cons:
> > 
> > == 1. ==
> > 
> > Add new annotations to build the property's validation rules (as it
> > was suggested by David)
> > @Max, @Min, @NotNull, @Size, @Nullable (already have this one), as
> > well as custom validators etc.
> > 
> > @Min(5.0) @Max(16.0)
> > public volatile double phi_convict_threshold = 8.0;
> > 
> > An example of such an approach is the Dropwizard Configuration library
> > (or Hibernate, Spring)
> > https://www.dropwizard.io/en/latest/manual/validation.html#annotations
> > 
> > 
> > == 2. ==
> > 
> > Add to the @Mutable the set (or single implementation) of validations
> > it performs, which is closer to what we have now.
> > As an alternative to having a single class for each constraint, we can
> > have an enumeration list that stores the same implementations.
> > 
> > public @interface Mutable {
> > Class<? extends ConfigurationConstraint<?>>[] constraints() default {};
> > }
> > 
> > public class NotNullConstraint implements ConfigurationConstraint<Object> {
> > public void validate(Object newValue) {
> > if (newValue == null)
> > throw new IllegalArgumentException("Value cannot be null");
> > }
> > }
> > 
> > public class PositiveConstraint implements ConfigurationConstraint<Object> {
> > public void validate(Object newValue) {
> > if (newValue instanceof Number && ((Number) newValue).intValue() <= 0)
> > throw new IllegalArgumentException("Value must be positive");
> > }
> > }
> > 
> > @Mutable(constraints = { NotNullConstraint.class, PositiveConstraint.class })
> > public volatile Integer concurrent_compactors;
> > 
> > Something similar is performed for Custom Constraints in Hibernate.
> > https://docs.jboss.org/hibernate/stable/validator/reference/en-US/html_single/#section-constraint-validator
> >  
> > 
> > == 3. ==
> > 
> > Enforce setter method names to match the corresponding property name.
> > This will allow us to call the setter method with reflection, and the
> > method itself will do all the validation we need.
> > 
> > public volatile int key_cache_keys_to_save;
> > 
> > public static void setKeyCacheKeysToSave(int keyCacheKeysToSave)
> > {
> > if (keyCacheKeysToSave < 0)
> > throw new IllegalArgumentException("key_cache_keys_to_save
> > must be positive");
> > conf.key_cache_keys_to_save = keyCacheKeysToSave;
> > }
> > 
> > 
> > I think the options above are the most interesting for us, but if you
> > have something more appropriate, please share. From my point of view,
> > option 2 is the most appropriate here, as it fits with everything we
> > have discussed in this thread. However, they are all fine to go with.
> > 
> > I'm looking forward to hearing your thoughts.
> > 
> > On Tue, 21 Feb 2023 at 22:06, Maxim Muzafarov <mmuzaf@apache.org> wrote:
> > > 
> > > Hello everyone,
> > > 
> > > 
> > > I would like to share and discuss the key point of the solution design
> > > with you before I finalise a pull request with tedious changes
> > > remaining so that we are all on the same page with the changes to the
> > > valuable Config class and its accessors.
> > > 
> > > Here is the issue I'm working on:
> > > "Allow UPDATE on settings virtual table to change running configurations".
> > > https://issues.apache.org/jira/browse/CASSANDRA-15254
> > > 
> > > Below is the restricted solution design at a very high level, all the
> > > details have been discussed in the related JIRA issue.
> > > 
> > > 
> > > = What we have now =
> > > 
> > > - We use JMX MBeans to mutate this runtime configuration during the
> > > node run or to view the configuration values. Some of the JMX MBean
> > > methods use camel case to match configuration field names;
> > > - We use the SettingsTable only to view configuration values at
> > > runtime, but we are not able to mutate the configuration through it;
> > > - We load the configuration from cassandra.yaml into the Config class
> > > instance during the node bootstrap (is accessed with
> > > DatabaseDescriptor, GuardrailsOptions);
> > > - The Config class itself has nested configurations such as
> > > ReplicaFilteringProtectionOptions (it is important to keep this always
> > > in mind);
> > > 
> > > 
> > > = What we want to achieve =
> > > 
> > > We want to use the SettingsTable virtual table to change the runtime
> > > configuration, as we do it now with JMX MBeans, and:
> > > - If the affected component is updated (or the component's logic is
> > > executed) before or after the property change, we want to keep this
> > > behaviour for the virtual table for the same configuration property;
> > > - We want to ensure consistency of such changes between the virtual
> > > table API and the JMX API used;
> > > 
> > > 
> > > = The main question =
> > > 
> > > To enable configuration management with the virtual table, we need to
> > > know the answer to the following key question:
> > > - How can we be sure to determine at runtime which of the properties
> > > we can change and which we can't?
> > > 
> > > 
> > > = Options for an answer to the question above =
> > > 
> > > 1. Rely on the volatile keyword in front of fields in the Config class;
> > > 
> > > I would say this is the most confusing option for me because it
> > > doesn't give us all the guarantees we need, and also:
> > > - We have no explicit control over what exactly we expose to a user.
> > > When we modify the JMX API, we're implementing a new method for the
> > > MBean, which in turn makes this action an explicit exposure;
> > > - The volatile keyword is not the only way to achieve thread safety,
> > > and looks strange for the public API design point;
> > > - A good example is the setEnableDropCompactStorage method, which
> > > changes the volatile field, but is only visible for testing purposes;
> > > 
> > > 2. Annotation-based exposition.
> > > 
> > > I have created Exposure(Exposure.Policy.READ_ONLY),
> > > Exposure(Exposure.Policy.READ_WRITE) annotations to mark all the
> > > configuration fields we are going to expose to the public API (JMX, as
> > > well as the SettingsTable) in the Config class. All the configuration
> > > fields (in the Config class and any nested classes) that we want to
> > > expose (and already are used by JMX) need to tag with an annotation of
> > > the appropriate type.
> > > 
> > > The most confusing thing here, apart from the number of tedious
> > > changes: we are using reflection to mutate configuration field values
> > > at runtime, which makes some of the fields look "unused" in the IDE.
> > > This can be not very pleasant for developers looking at the Config
> > > class for the first time.
> > > 
> > > You can find the PR related to this type of change here (only a few
> > > configuration fields have been annotated for the visibility of all
> > > changes):
> > > https://github.com/apache/cassandra/pull/2133/files
> > > 
> > > 
> > > 3. Enforce setter/getter method name rules by converting these methods
> > > in camel case to the field name with underscores.
> > > 
> > > To rely on setter methods, we need to enforce the naming rules of the
> > > setters. I have collected information about which field names match
> > > their camel case getter/setter methods:
> > > 
> > > total: 345
> > > setters: 109, missed 236
> > > volatile setters: 90, missed 255
> > > jmx setters: 35, missed 310
> > > getters: 139, missed 206
> > > volatile getters: 107, missed 238
> > > jmx getters: 63, missed 282
> > > 
> > > The most confusing part of this type of change is the number of
> > > changes in additional classes according to the calculation above and
> > > some difficulties with enforcing this rule for nested configuration
> > > classes.
> > > 
> > > Find out what this change is about here:
> > > https://github.com/apache/cassandra/pull/2172/files
> > > 
> > > 
> > > = Summary =
> > > 
> > > In summary, from my point of view, the annotation approach will be the
> > > most robust solution for us, so I'd like to continue with it. It also
> > > provides an easy way to extend the SettingTable with additional
> > > columns such as runtime type (READ_ONLY, READ_WRITE) and a description
> > > column. This ends up looking much more user-friendly.
> > > 
> > > Another advantage of the annotation approach is that we can rely on
> > > this annotation to generate dedicated dynamic JMX beans that only
> > > respond to node configuration management to avoid any inconsistencies
> > > like those mentioned here [2] (I have described a similar approach
> > > here [1], but for metrics). But all this is beyond the scope of the
> > > current changes.
> > > 
> > > Looking forward to your thoughts.
> > > 
> > > 
> > > [1] https://lists.apache.org/thread/26j9hhy39okw0wy79mtylb753w6xjclg
> > > [2] https://issues.apache.org/jira/browse/CASSANDRA-17734


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic