On Wed, Apr 15, 2009 at 10:34 AM, Sergiu Dumitriu <sergiu(a)xwiki.com> wrote:
Niels Mayer wrote: ...
Of course, the servlet spec allows streaming, and there are
implementations of servlet-streaming that are quite straightforward to
implement, but may have performance/overhead issues anyways in real-use,
...
http://www.longtailvideo.com/support/forum/Modules/14410/Servlet-Streaming
Modern containers have support for NIO and long-lasting requests
(cometd). It's just a matter of configuration.
Yes, I gave the above example to show how a streaming service can be
achieved in a modern container.
I think this is classified as a religious debate.
In looking for something to back up my point, I found this article suggesting
the threads vs events performance payoff is only 5-10% (from
http://www.cs.toronto.edu/syslab/courses/csc2231/05au/lectures/lecture04.pdf)
Concurrency management
• A religious topic: threads vs. events
– Threads
• Easier to program
• Easy to understand and exploit parallelism (multi-proc)
– Events
• Easier to program
• Scheduling can be controlled and exploited
– Not hidden in the thread scheduler or lock
• Performance, scaling
• All this makes sense only…
– If the bottleneck is due to threads/events (unlikely)
Pipeline servers: L1/L2 cache
• Claim: instructions-per-cycle is low on servers
– Threads hurt l-cache performance
– Idea: re-architect software into computational stages
• Execute each task repetitively in a stage
• Problems:
– Quite a drastic change in architecture
– Working set size of stage must align well with l-cache size
– Performance pay-off is minimal
• 5-10% improvement (1 month of Moore’s law)
However, that doesn't include other issues of overhead that are magnified by
Java and threads. And on a Unix server, it all ends up getting implemented
as a select() somewhere inside Java anyways. Java just ends up being a
high-overhead wrapper to the bare-metal of the I/O mechanism on a server :-)
This, however, is clearly a "religious" POV.
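
To make the select() point concrete, here is a minimal sketch of what the event-style approach looks like from Java, using java.nio's Selector, which is the usual Java-level front end to the OS readiness mechanism. Everything named here (SelectLoopSketch, PAYLOAD, serve, demo) is hypothetical and made up for illustration; it is not taken from any container or from Xwiki, and a real streaming server would need timeouts, error handling and backpressure on top of this.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.nio.ByteBuffer;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.channels.ServerSocketChannel;
import java.nio.channels.SocketChannel;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

class SelectLoopSketch {
    // Hypothetical payload standing in for one chunk of media data.
    static final byte[] PAYLOAD = "chunk-of-media\n".getBytes(StandardCharsets.US_ASCII);

    // Sends PAYLOAD to `expected` clients using one thread and one Selector.
    static int serve(ServerSocketChannel server, int expected) throws IOException {
        server.configureBlocking(false);
        int served = 0;
        try (Selector selector = Selector.open()) {
            server.register(selector, SelectionKey.OP_ACCEPT);
            while (served < expected) {
                selector.select(); // blocks until at least one channel is ready
                Iterator<SelectionKey> it = selector.selectedKeys().iterator();
                while (it.hasNext()) {
                    SelectionKey key = it.next();
                    it.remove();
                    if (key.isAcceptable()) {
                        SocketChannel client = server.accept();
                        if (client == null) continue;
                        client.configureBlocking(false);
                        // Per-connection state is a buffer, not a thread.
                        client.register(selector, SelectionKey.OP_WRITE, ByteBuffer.wrap(PAYLOAD));
                    } else if (key.isWritable()) {
                        SocketChannel client = (SocketChannel) key.channel();
                        ByteBuffer pending = (ByteBuffer) key.attachment();
                        client.write(pending); // may be partial; the key fires again
                        if (!pending.hasRemaining()) {
                            client.close(); // closing the channel cancels its key
                            served++;
                        }
                    }
                }
            }
        }
        return served;
    }

    // Connects n loopback clients, serves them all, returns total bytes received.
    static int demo(int n) throws IOException {
        InetAddress loopback = InetAddress.getLoopbackAddress();
        try (ServerSocketChannel server = ServerSocketChannel.open()) {
            server.bind(new InetSocketAddress(loopback, 0)); // port 0: any free port
            int port = ((InetSocketAddress) server.getLocalAddress()).getPort();
            List<SocketChannel> clients = new ArrayList<>();
            for (int i = 0; i < n; i++) {
                clients.add(SocketChannel.open(new InetSocketAddress(loopback, port)));
            }
            serve(server, n);
            int received = 0;
            for (SocketChannel c : clients) {
                ByteBuffer buf = ByteBuffer.allocate(64);
                int r;
                while ((r = c.read(buf)) != -1) { // read until the server closes
                    received += r;
                    buf.clear();
                }
                c.close();
            }
            return received;
        }
    }

    public static void main(String[] args) throws IOException {
        System.out.println("bytes received by 3 clients: " + demo(3));
    }
}
```

The thing to notice is that the number of connections only changes how many keys the selector holds; the thread count stays at one. Whether that actually beats thread-per-request in a given container is exactly the religious part.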
Just consider the "object" and "thread" overhead needed to support 100, or
500, or 1000 concurrent long-downloads of streaming-media in Java. Those are
not unheard of numbers for streaming media, even small-scale. The issues are
many when it comes to scaling: cache-nonlocality caused by sporadic bits of
code getting executed all-over the place in a giant memory image, the
unpredictable-time-response caused by garbage collections (or the additional
performance penalty of incremental gc), the cache-nonlocality caused by
garbage collections, etc.
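
As a back-of-the-envelope illustration of just the thread part of that overhead, here is a small sketch that multiplies connection count by per-thread stack size. The 1 MiB stack figure is an assumption on my part: the real value depends on the JVM, the platform and the -Xss setting, and it is address space reserved per thread rather than memory that is necessarily resident. The class and method names are hypothetical, and the estimate ignores per-connection heap objects and GC effects entirely.

```java
class ThreadOverheadEstimate {
    // ASSUMPTION: 1 MiB of stack reserved per thread; tune with -Xss in practice.
    static final long ASSUMED_STACK_BYTES = 1024L * 1024L;

    // Stack address space reserved if every connection gets its own thread.
    static long reservedStackBytes(int connections, long stackBytesPerThread) {
        return connections * stackBytesPerThread;
    }

    public static void main(String[] args) {
        for (int n : new int[] {100, 500, 1000}) {
            long mib = reservedStackBytes(n, ASSUMED_STACK_BYTES) / (1024L * 1024L);
            System.out.printf("%d threads -> about %d MiB of stack reserved%n", n, mib);
        }
    }
}
```

This is only the cheapest, most predictable piece; the cache and GC effects mentioned above are harder to put a number on.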
JMF would just help on the streaming side, but there are many other
tools that can do that. The hard part is the database, and the way we
handle attachments.
Assuming this is even what you want to be doing with Xwiki. I think Xwiki
should continue to do what it is doing and do it best.
I think a separate project with a separate architecture that already does
its job-best should be integrated w/ Xwiki to provide streaming service.
Independently, it would be great to improve Xwiki's attachment capabilities
so that in the future, they might prove worthy of using for streaming media
out of the database. Incremental uploading and the ability to actually store
100 MB attachments would be a good start....
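
By "incremental" I mean roughly the following: the attachment is moved through a small fixed-size buffer instead of being held in memory as one giant byte array. This is a minimal sketch only, assuming the attachment can be exposed as an InputStream and the destination as an OutputStream; it is not Xwiki's actual attachment API, and ChunkedCopySketch, copyInChunks and the 8 KiB chunk size are hypothetical names and values chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;

class ChunkedCopySketch {
    // Copies `in` to `out` using one reusable buffer of chunkSize bytes,
    // so memory use stays constant regardless of attachment size.
    static long copyInChunks(InputStream in, OutputStream out, int chunkSize) throws IOException {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] buffer = new byte[chunkSize];
        long total = 0;
        int n;
        while ((n = in.read(buffer)) != -1) {
            out.write(buffer, 0, n); // write only the bytes actually read
            total += n;
        }
        out.flush();
        return total;
    }

    public static void main(String[] args) throws IOException {
        byte[] fakeAttachment = new byte[100_000]; // stand-in for a large upload
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        long copied = copyInChunks(new ByteArrayInputStream(fakeAttachment), sink, 8192);
        System.out.println("copied " + copied + " bytes");
    }
}
```

The in-memory streams in main are only there to make the sketch runnable; the point is that the same loop works whether the two ends are a socket, a file, or a database BLOB stream.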
And streaming out of the database certainly *is* an interesting thing to do:
for example, being able to have an index of segments of video clips that one
could stream out of a database continuously and without
interruption/glitching between segments. Consider, however, that placing
large media datafiles directly into the database is problematic as it can
run up against system limits as well as practicality limits for
administration, backup, etc. Making use of existing work on media servers
where these issues have already been considered and solved is a good idea.
-- Niels
http://nielsmayer.com
PS: another interesting view on the religious topic of threads/vs/events
for throughput
www.cs.cornell.edu/courses/cs614/2004sp/slides/CS614%20-%20concurrency.ppt
From Concurrency, Threads, and Events by Ken Birman
(Based on a slide set prepared by Robbert van Renesse)
Events-based systems use fewer resources
Better performance (particularly scalability)
Event-based systems harder to program
Have to avoid blocking at all cost
Block-structured programming doesn’t work
How to do exception handling?
In both cases, tuning is difficult
In practice, many kinds of systems need to support both threads and events
Threaded programs in Unix are the common example of these, because window
systems use events
The programmer uses cthreads or pthreads
Major problem: the UNIX kernel interface wasn’t designed with threads in
mind!