List: cassandra-dev
Subject: Re: [DISCUSSION] Next release roadmap
From: Benedict Elliott Smith <benedict () apache ! org>
Date: 2021-04-26 17:14:38
Message-ID: AECE111D-AED6-4918-B3CE-B0B6D200DFC6 () apache ! org
I think my earlier response vanished into the moderator queue. Just a few comments:
1) I think the Paxos latency (and correctness) improvements should land in
4.0.x, as we have introduced a fairly significant regression and this work
mostly resolves outstanding issues with LWTs today.
2) If we aim to deliver multi-partition LWTs in 4.x/5.0, we will likely want
to pair this with work to reduce latency beyond the above work, as
contention will become a more significant problem. Should I be involved in
delivering multi-partition LWTs, I will also be aiming to deliver even lower
latencies for the release they land in.
3) To support all of the above work, I also aim to deliver a Simulator
facility for deterministically executing cluster workloads under adversarial
scheduling (i.e. one that intercepts all message and thread events and
evaluates them sequentially, in pseudorandom order), alongside
linearizability verification built upon this. This work will include (or
have as a prerequisite) significant clean-ups to internal functionality like
executors, use of futures and other concurrency primitives, and mocking out
of time and the filesystem.
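As a rough illustration of the idea in point 3, here is a toy sketch in
Python. It is not the Cassandra Simulator and makes no claim about how that
will be built; the Simulator class, run_workload, and the two-client counter
workload are all hypothetical names made up for this example. It only shows
the principle: events are intercepted rather than run directly, then executed
one at a time in an order chosen by a seeded pseudorandom generator, so any
schedule can be replayed from its seed.

```python
import random


class Simulator:
    """Toy deterministic scheduler (hypothetical, for illustration only)."""

    def __init__(self, seed):
        self.rng = random.Random(seed)  # the seed fully determines the schedule
        self.pending = []               # intercepted events awaiting execution
        self.trace = []                 # labels in the order events actually ran

    def submit(self, label, action):
        """Intercept an event (message or task) instead of running it now."""
        self.pending.append((label, action))

    def run(self):
        """Execute pending events sequentially, in pseudorandom order."""
        while self.pending:
            index = self.rng.randrange(len(self.pending))
            label, action = self.pending.pop(index)
            self.trace.append(label)
            action()                    # an action may submit further events
        return self.trace


def run_workload(seed):
    """Two clients each do a non-atomic read-then-write on a shared counter."""
    sim = Simulator(seed)
    state = {"counter": 0}

    def client(name):
        def read():
            seen = state["counter"]

            def write():
                state["counter"] = seen + 1

            # The write is a separate event, so other events may run in between.
            sim.submit(f"{name}:write", write)

        sim.submit(f"{name}:read", read)

    client("a")
    client("b")
    trace = sim.run()
    return trace, state["counter"]


if __name__ == "__main__":
    for seed in range(5):
        print(seed, *run_workload(seed))
```

Running the same seed twice gives the same trace and final counter. Sweeping
seeds explores different interleavings, including ones where both reads run
before either write and an increment is lost. A verification layer built on
top of such a scheduler would flag those histories and report the seed needed
to reproduce them.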
On 23/04/2021, 14:46, "Benjamin Lerer" <b.lerer@gmail.com> wrote:
Hi everybody,
Thanks for all the responses. I went through the emails and aggregated the
proposals to give us an idea on where we stand at this point.
I only included the improvements in the list and left on the side the bug
fixes.
Regarding bug fixes, I wonder whether we should hold a monthly discussion
to decide which important issues should be fixed first.
I feel that we sometimes tend to forget old issues even if they are more
important than some new ones.
Do not hesitate to tell me if I missed something or misinterpreted some
proposal.
*Query side improvements:*
* Storage Attached Index or SAI. The CEP can be found at
https://cwiki.apache.org/confluence/display/CASSANDRA/CEP-7%3A+Storage+Attached+Index
* Add support for OR predicates in the CQL where clause
* Allow aggregation by time intervals (CASSANDRA-11871) and allow UDFs
in the GROUP BY clause
* Ability to read the TTL and WRITETIME of an element in a collection
(CASSANDRA-8877)
* Multi-Partition LWTs
* Materialized views hardening: Addressing the different Materialized
Views issues (see CASSANDRA-15921 and [1] for some of the work involved)
*Security improvements:*
* SSTables encryption (CASSANDRA-9633)
* Add support for Dynamic Data Masking (CEP pending)
* Allow the creation of roles that have the ability to assign arbitrary
privileges, or scoped privileges without also granting those roles access
to database objects.
* Filter rows from system and system_schema based on user permissions
(CASSANDRA-15871)
*Performance improvements:*
* Trie-based index format (CEP pending)
* Trie-based memtables (CEP pending)
* Paxos improvements: Paxos / LWT implementation that would enable the
database to serve serial writes with two round-trips and serial reads with
one round-trip in the uncontended case
*Safety/Usability improvements:*
* Guardrails. The CEP can be found at
https://cwiki.apache.org/confluence/display/CASSANDRA/%28DRAFT%29+-+CEP-3%3A+Guardrails
* Add ability to track state in repair (CASSANDRA-15399)
* Repair coordinator improvements (CASSANDRA-15399)
* Make incremental backup configurable per keyspace and table
(CASSANDRA-15402)
* Add ability to blacklist a CQL partition so all requests are ignored
(CASSANDRA-12106)
* Add default and required keyspace replication options (CASSANDRA-14557)
* Transactional Cluster Metadata: Use of transactions to propagate
cluster metadata
* Downgrade-ability: Ability to downgrade in the event that a serious
issue has been identified
*Pluggability improvements:*
* Pluggable schema manager (CEP pending)
* Pluggable filesystem (CEP pending)
* Pluggable authenticator for CQLSH (CASSANDRA-16456). A CEP draft can be
found at
https://docs.google.com/document/d/1_G-OZCAEmDyuQuAN2wQUYUtZBEJpMkHWnkYELLhqvKc/edit
* Memtable API (CEP pending). The goal being to allow improvements such
as CASSANDRA-13981 to be easily plugged into Cassandra
*Memtable pluggable implementation:*
* Enable Cassandra for Persistent Memory (CASSANDRA-13981)
*Other tools:*
* CQL compatibility test suite (CEP pending)
On Thu, Apr 22, 2021 at 16:11, Benjamin Lerer <b.lerer@gmail.com> wrote:
>> Finally, I think it's important we work to maintain trunk in a shippable
>> state.
>
>
> I am +100 on this. Bringing Cassandra to such a state was a huge effort
> and keeping it that way will help us to ensure the quality of the
> releases.
>
> On Thu, Apr 15, 2021 at 17:30, Scott Andreas <scott@paradoxica.net> wrote:
>
>> Thanks for starting this discussion, Benjamin!
>>
>> I share others' enthusiasm on this thread for improvements to secondary
>> indexes, trie-based partition indexes, guardrails, and encryption at rest.
>>
>> Here are some other post-4.0 areas for investment that have been on my
>> mind:
>>
>> – Transactional Cluster Metadata
>> Migrating from optimistic modification and propagation of cluster
>> metadata via gossip to a transactional implementation opens a lot of
>> possibilities. Token movements and instance replacements get safer and
>> faster. Schema changes can be made atomic, enabling users to execute DDL
>> rapidly without waiting for convergence. Operations like expansions and
>> shrinks become easier to automate with less care and feeding.
>>
>> – Paxos improvements
>> During discussion on C-12126, Benedict expressed interest in post-4.0
>> improvements that can be made to Cassandra's Paxos / LWT implementation
>> that would enable the database to serve serial writes with two round-trips
>> and serial reads with one round-trip in the uncontended case. For many
>> cross-WAN serial use cases, this may halve the latency of CAS queries.
>>
>> – Multi-Partition LWTs
>> LWT is a great primitive, but modeling applications with the constraint
>> of single-key CAS can be a game of Twister. Extending the paxos
>> improvements discussed above to enable multi-partition CAS would enable
>> users of Apache Cassandra to perform serial operations across partition
>> boundaries.
>>
>> – Downgrade-ability
>> I also see "downgradeability" as important to future new release
>> adoption. Taking file format changes as an example, it's currently not
>> possible to downgrade in the event that a serious issue has been identified
>> – unless you're able to host-replace yourself out after upgrading one
>> replica, or revert to a pre-upgrade snapshot and accept data loss. It would
>> be excellent if it were possible for v.next to continue writing the
>> previous SSTable/commitlog/hint/etc. format until a switch is flipped to
>> opt into new file formats. Apache HDFS takes a similar approach, enabling
>> downgrade until NameNode metadata is finalized [1]. This would be an
>> excellent capability to have in Apache Cassandra, and dramatically lower
>> the stakes for new release adoption.
>>
>> On pluggability / disaggregation:
>> I agree that these are important themes. We'll want to bring a lot of
>> care and attention to this work. Disaggregation can open a lot of
>> possibilities - with the drawback of future changes being restricted to the
>> defined interface and an inability to optimize across interface boundaries.
>> We can probably hit a sweet spot, though.
>>
>> Toolchains to validate implementations of pluggable components will
>> become very important. It would be bad for the project's users if bundled
>> implementations were of uneven quality or supported subsets of
>> functionality. Converging on a common validation toolchain for pluggable
>> subsystems can help us ensure that quality while minimizing the effort
>> required to test new implementations.
>>
>> Finally, I think it's important we work to maintain trunk in a shippable
>> state. This might look like major changes and new features hiding behind
>> feature flags that enable users to selectively enable them as development
>> and validation proceeds, with new code executed regardless of the flag held
>> to a higher standard.
>>
>> Cheers,
>>
>> – Scott
>>
>> [1]
>> https://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsRollingUpgrade.html
>>
>>
>> ________________________________________
>> From: guo Maxwell <cclive1601@gmail.com>
>> Sent: Wednesday, April 14, 2021 10:25 PM
>> To: dev@cassandra.apache.org
>> Subject: Re: [DISCUSSION] Next release roadmap
>>
>> +1
>>
>> On Thu, Apr 15, 2021 at 4:48 AM, Brandon Williams <driftx@gmail.com> wrote:
>>
>> > Agreed. Everyone just please keep in mind this thread is for roadmap
>> > contributions you plan to make, not contributions you would like to
>> > see.
>> >
>> > On Wed, Apr 14, 2021 at 3:45 PM Nate McCall <zznate.m@gmail.com> wrote:
>> > >
>> > > Agree with Stefan 100% on this. We need to move towards pluggability.
>> > > Our users are asking for it, it makes sense architecturally, and people
>> > > are doing it anyway.
>> > >
>> > >
>> > > ...
>> > > > for me definitely
>> > > > https://issues.apache.org/jira/browse/CASSANDRA-9633
>> > > >
>> > > > I am surprised nobody mentioned this in the previous answers. There
>> > > > are ~50 people waiting for it to happen, and multiple people working
>> > > > on it seriously who have wanted that feature for so, so long.
>> > > > ...
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: dev-unsubscribe@cassandra.apache.org
>> > For additional commands, e-mail: dev-help@cassandra.apache.org
>> >
>> >
>>
>> --
>> you are the apple of my eye !
>>