'Re: Logging JEP'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-serviceability-dev
Subject:    Re: Logging JEP
From:       Fredrik Arvidsson <fredrik.arvidsson () oracle ! com>
Date:       2014-05-19 9:11:44
Message-ID: 5379CAD0.8090100 () oracle ! com
[Download RAW message or body]

Hi
Comments below.

/Fredrik

On 2014-05-19 00:55, Kirk Pepperdine wrote:
> > > > > My primary objection with the current specification is that the JEP is both \
> > > > > over-reaching and restrictive. It suggest taxonomy a closed hierarchical \
> > > > > taxonomy instead of a more open and flexible arrangement. The JEP \
> > > > > overreaches in that it defines or suggests extremely limited levels which \
> > > > > are all completely devoid of context or context that allow one to pull out \
> > > > > certain pieces of information in exclusion to other pieces of information. \
> > > > > It also allows for developers to make very arbitrary decisions as to what \
> > > > > to log at each “levels”. Where as PrintAdapativeSizingPolicy gives a very \
> > > > > clear indication as what will be logged, G1.logging=WARNING is… well... I \
> > > > > don’t know what is and is not suppose to be included. In addition, maybe \
> > > > > PrintAdaptiveSizingPolicy is not interesting but PrintTenuringDistribution \
> > > > > is. And lets say they are both at WARNING log level. Even better, lets say \
> > > > > that with CMS you get PrintTenuringDistribution at FINE and with G1 you get \
> > > > > it at FINEST. The point of all this is the logging framework is \
> > > > > complicating questions that are none of it’s concern. So I would suggest \
> > > > > that the logging framework get out of the business of trying to predefine \
> > > > > categories of logging, especially hierarchical categories of logging.
> > > > With the hierarchical logger design proposed in the JEP I cant see why you \
> > > > could not get the behaviour you're asking for. By targeting sub/child loggers \
> > > > it is possible to enable/disable arbitrary levels in the tree. And at the \
> > > > same time you could enable logging for lets say all GC related activity by \
> > > > enable the top most logger (root). It is possible to create sparse \
> > > > configurations including/excluding single loggers.
> > > I think that pub/sub is a much simpler model that is easier to understand than \
> > > what you’ve described.
> > I don't think the difference is that big. Enabling logging on a logger is \
> > essentially the same as subscribing to a topic.
> the only exception is that the JEP eliminates all possibilities besides the topics \
> that the JEP decided are “needed”.
The JEP does not specify or enforce any specific loggers. The loggers 
mentioned in the JEP are all just examples on how the naming scheme 
could look like. It is up to the developers to add loggers whenever they 
are needed. Adding a logger is as easy as adding one line of text in a 
header file.
> > > > The logger hierarchy does not need to be 'component oriented' even if I think \
> > > > it makes the most sense. I think you put far too much emphasis on log levels.
> > > Funny but I think the spec puts far too much emphasis on log levels.
> > Loggers and Adapters are the most important things in the JEP. Log levels are \
> > just to a way to select the verbosity of the output.
> So my argument has been, why not just use user defined topics to select the \
> verbosity? The topics will have semantic meaning where as these predefined levels \
> have none.
The log levels will only have a meaning in the sense of selecting 
verbosity of the output, not selecting the type of output. The 
-XX:PrintGCXXXX flags of today would correspond to a logger, not a log 
level.
> > > > Log levels should not be used to categorise information.
> > > In practice they do categorize information because there is no other practical \
> > > way to categorize it. I set verbosity to debug because I expect all debug \
> > > messages to show up. 
> > > > It should be used to select the verbosity of the log data output.
> > > I guess to understand this further I’d need to see an example of how it would \
> > > be used. For example, this is a single G1GC event 
> > > 1.047: [Full GC (System.gc()) 17M->3728K(13M), 0.0212570 secs]
> > > [Eden: 18.0M(24.0M)->0.0B(5120.0K) Survivors: 0.0B->0.0B Heap: \
> > > 17.2M(256.0M)->3728.7K(13.0M)] 
> > > How would you suggest that the JVM use the logging framework to log this \
> > > message?
> > This information should probably be logged in the 'info' level since the purpose \
> > of the information is to provide information. Maybe 'debug' level could be used \
> > to log even more information about how the numbers were computed.
> This information is logged at PrintGCDetails level in the 1.8 and at PrintGCDetails \
> PrintGCCause in the 1.7.0_45 and better. There is more detail available but that \
> comes to you via different flags. For example adaptive sizing and reference \
> processing. Are these both debug and if so how could I get one without getting the \
> other?
We could have 'gc_instrumentation_details', 'gc_instrumentation_cause' 
and 'gc_instrumentation_XXXX' loggers defined. All logging on 'info' 
level as default. Logging using 'debug' and 'trace' level could add even 
more details on each type of information if needed. Again, the GC 
logging is only one use of this logging framework, and it happens to be 
a bit special. More like a form of instrumentation than classic logging. 
There are many other parts of the JVM that needs logging in a more 
traditional way.
> 
> > > > We have only five levels of verbosity in the JEP (plus disabled). I think \
> > > > that is a well balanced number.
> > > I don’t know how to respond to this without risking sounding inflammatory and \
> > > I’d really like not like to be inflammatory. ;-)
> > Well, you just did ;)
> > > > By pre defining all available loggers(categories) in the source code we \
> > > > actually give the user the possibility to enumerate all those, and see what \
> > > > different type of information that are in there.
> > > Why do you need to know levels prior to iterating? Pub/Sub does not come with \
> > > this requirement.
> > If you are not to get a lot of noise/unwanted information you would have to know \
> > what topics to subscribe to. I can't really see the difference here. its like \
> > saying that you want news, and then start subscribing to all newspapers. You will \
> > get news, but a lot of it you're not interested in.
> What I’m saying is that of course one is going to have to know what to subscribe \
> to. As it is one needs to know which flags should be set if you need to get a \
> information from the JVM. If I subscribe to the wrong topics then for sure I’ll get \
> information that you don’t want or need. However if I happen to subscribe to debug, \
> I’m very certain to get information that I neither want nor need. In either case it \
> will not relieve the developer to document what is being logged so the consumer \
> understand what is being produced.
Agreed. Documentation must still be there. But having the system knowing 
what topics there are in runtime will open up for some really nice 
features I think. The ability to enumerate loggers/topics can be very 
useful.
> > > > On top of that we get compile time checks of that its valid.
> > > > Your proposal would require any module implementer to carefully document all \
> > > > 'tags' used in some place in some way. The latter would not make it easier \
> > > > for users, it would make it even worse than it is today.
> > > What I read here, and maybe I’m reading it wrong, is that the log levels are \
> > > used to categorize messages and that simply using your predefined tags \
> > > alleviates developers from carefully documenting the information that is \
> > > included with each tag/level. This is exactly one of the bigger problems that I \
> > > see in the field. Developers drop things into a log level and they don’t \
> > > document what comes with each level. You can’t relieve developers of the \
> > > responsibility of documenting what will happen when I turn the logging level to \
> > > debug (or some other level).
> > Log levels aren't used to categorize anything They are used to select verbosity \
> > of the output. Loggers are what is used to categorize the information/messages. \
> > Most of the GC logging discussed would as I see it end up in the 'info' log level \
> > in a couple of loggers, maybe corrsponding todays -XX:+PrintXXX. Debug logging \
> > and even more verbose information would be put in the debug and trace level. The \
> > latter two is nothing normally used to get GC data/instrumentation into log \
> > files.
> > > > > Instead I believe it is more useful for the framework to support more of a \
> > > > > pub/sub model where each of the frameworks or components in the JVM are \
> > > > > free to devise a logging scheme or taxonomy that makes sense for that \
> > > > > component.
> > > > This would not be something targeted by this JEP. The JEP title is: *Unified* \
> > > > JVM Logging. what you describe is similar to what we have today. All modules \
> > > > have their own way of doing logging. This whole discussion is about \
> > > > addressing that problem, not making it a rule. I have gotten heaps of \
> > > > feedback from developers asking me to bring an end to this arbitrary 'log as \
> > > > you like' situation they live in today.
> > > I’m not advocating what is in the JVM today. Unifying the logging scheme makes \
> > > sense. It allows for better standardization of tooling, just like we have with \
> > > JMS client technology.
> > Well if we leave it up to each module to devise a logging scheme and the taxonomy \
> > for its logs, in what way would this make the logging more unified?
> It would provide a common framework for logging. For example, GC logging can be \
> directed to a log file where JIT information can only be written to stdout. One of \
> the tasks would be to allow me to pump the information to a file, stdout, a JMX \
> client or anything else that I might see fit to write an adaptor for. The \
> information should come to the sink point in a coherent manner. IOWs, what we see \
> in GC logging is that incomplete messages are delivered to the sink and then \
> another event interrupts, that event gets pushed out, either in it’s entirety or \
> maybe not and then maybe you get the terminating information from the interrupted \
> event. I would expect the common framework to help prevent this type of \
> “corruption” from occurring. 
Much of what you mention here I would class as bugs. Fixing disrupted 
and fragmented log output due to concurrency issues isn't the main 
purpose of this JEP, its just one small part of it.
The concept of Adapters mentioned in the JEP opens up for developers to 
add more ways of getting the data out from the JVM. The initial goal is 
to provide console logging and logging to file. I can see more Adapters 
further down the road. Syslog adapter for *nix systems and an EventLog 
adapter for Windows are natural candidates. Adapters for JMX, JFR and 
sockets might be added later.
> > > > > For example, Twitter provides a communication platform. By convention we \
> > > > > all use @ and # to specify targets and subjects. Imagine if Twitter limited \
> > > > > all # tags to be Fine, Finer and Finest. It’s a thought that is so crazy \
> > > > > you wouldn’t even consider it. Twitter does *not* try to define what those \
> > > > > targets and subjects are as there is not reasonable way in which it can. So \
> > > > > it’s not an accident that these decisions are left up to the users to \
> > > > > decide. I’m not suggesting the JVM logging framework copy Twitter. I’m \
> > > > > merely  using it as an example of an open and flexible logging system.
> > > > Again I think you are confusing log levels (verbosity) with loggers. Log \
> > > > levels are not depicting information topics, or the actual source of the \
> > > > information, log levels should be used to filter the amount of information \
> > > > about one topic.
> > > Again, rightly or wrongly this is the way that loggers get used because \
> > > categorization is not part of the design.
> > If we leave the log levels out for a moment. I cant see the real difference \
> > between using loggers and @/# to categorise the logged information. I only see \
> > positive things ordering the information in a hierarchy where possible and \
> > desirable.
> This isn’t rallying against hierarchies as I see positive things in them as well. \
> That said, a topic pub/sub based system would not prevent you from devising the \
> scheme should that be appropriate where as a non-hierarchical based system cannot \
> be fitted on top of a hierarchical one. Since neither of us know how logging will \
> evolve in the future it seems to me that moving to the more flexible scheme is \
> making it more likely to be adaptive to what ever that change looks like. 
There is nothing in the JEP that forces development to add loggers in 
hierarchy. It is just a possibility to be used where it makes sense.
> > > I know Log4J is very well liked until you get bit in a$$ by it.. and I run into \
> > > projects on a regular basis that have gotten bitten very hard. I include an \
> > > entire section of my performance tuning course to the subject and I have to say \
> > > that a number of organizations have changed how they log as a result. BTW, \
> > > being in the majority just means you’re in the majority, it doesn’t mean the \
> > > majority is correct. I shan’t bore you with historical evidence to that point \
> > > ;-)
> > Sure, you're probably right about some cases suffer from this design. Question is \
> > if the JVM logging would be better off with something radically different from \
> > the design I present in this JEP? We are not aiming to be non-intrusive with this \
> > framework. There are other more viable solutions to that problem.
> I believe the JVM would be better off with a different design which is why I’m \
> making arguments to at least explore the alternative to see what they might look \
> like. I’d also like to bring others into the discussion that have experiences here \
> and may not normally engage in discussions on JVM forums. For example Neal Ford has \
> some excellent talks w.r.t creating taxonomies for systems. I posted a link to one \
> of his talks (JFokus edition I believe) when the JEP was first released. I don’t \
> really want to call out other people because they may not have the time at this \
> moment to participate. However I hope they do inject their expertise in this area \
> because we’re going to have to live with what comes out of these discussions for a \
> very long time to come.
Agreed, the long term impact is the main reason for me to have this JEP 
discussion in the open.
> 
> > > Agreed, I don’t believe this JEP should be in the business of specifying \
> > > payload..  but the text in the JEP would seem to prohibit it.
> > That should be fixed in the JEP text. It should say something like that in a \
> > first step human readable data is the goal.
> But why specify anything about payload at all? If you feel the need to add things \
> shouldn’t that be contained in a header that would be separate from the payload?
The type of payload used when logging impacts the actual data types in 
the API. In the current API we are implementing a scheme similar to 
printf() since it is a well known, and liked way of formatting information.
The JEP tries to set up goals to be completed, so it has to put 
restrictions on some things to be able to reach all the way. If needed 
we could add support for binary data in the API, but I still haven't 
seen any use case needing it.

Cheers
/Fredrik


[prev in list] [next in list] [prev in thread] [next in thread]
Configure | About | News | Add a list | Sponsored by KoreLogic