On 15 Apr 2016, at 11:00, Eduard Moraru
<enygma2002(a)gmail.com> wrote:
Hi,
Re http://<server>/<context>/tmp/<module id>/<serialized owner document reference>/<module-dependent resource path>
1. What determines this <module id>? What about collisions between
extensions not knowing about each other?
Not a real problem. Of course each extension needs a unique name. They can use
groupId:artifactId if they want. This is at the level of best practice IMO.
FTR module id is what we currently use.
2. Are all these temporary resources bound to a document? What about the
ones that are not? They would use some random doc reference there? Would
that matter?
Right now it’s the case for all our temporary resources. It doesn’t matter if the doc
exists or not. This can just be considered as a unique ref inside the module.
If we want to be generic I could just store it as a String in TemporaryResourceReference
and have the module using it decide what it represents and do the resolving of it into a
Java object (DocumentReference or something else). That would make it fully generic if we
want that. It would make the module using it a bit more complex (since it needs to inject
some resolver) but we could probably provide some helpers.
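
To make the "store it as a String and let the module resolve it" idea concrete, here is a minimal sketch in Python (chosen only for brevity; the real code would be Java). The class below is a hypothetical stand-in, not the actual TemporaryResourceReference, and `resolve_owner_as_document` is an invented toy resolver that only assumes dot-separated references with backslash escapes.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class TemporaryResourceReference:
    """Hypothetical stand-in for the generic reference discussed above."""
    module_id: str
    owner: str                      # opaque string; only the module knows its meaning
    resource_path: Tuple[str, ...]  # module-dependent path segments


def resolve_owner_as_document(owner: str) -> Tuple[str, ...]:
    """Toy module-side resolver: split on '.', treating '\\' as an escape character."""
    parts, current, escaped = [], [], False
    for ch in owner:
        if escaped:
            current.append(ch)      # escaped character is taken literally
            escaped = False
        elif ch == "\\":
            escaped = True
        elif ch == ".":
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return tuple(parts)


ref = TemporaryResourceReference("officeviewer", "A.B.WebHome", ("slide0.jpg",))
print(resolve_owner_as_document(ref.owner))
```

The generic layer never interprets `owner`; a module that does not bind its resources to a document could plug in a different resolver or none at all.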
As far as I see it, after
http://<server>/<context>/tmp/<module id>/ we can
have pretty much anything (i.e. <module-dependent resource path>),
depending on the app's needs.
Alternatively, this being a generic service, maybe we can assign a random
ID on request and keep a mapping in a separate table.
That’s way too complex IMO. And for a mapping you’d still need to have the module id in
that table anyway to be ;)
A module needs to be able to construct the URL from its discrete data.
We could even drop
the <module id> from the URL and store it in the table instead (as a
column). The idea would be that we would not need to expose all these
"coordinates" in the actual URL and pollute it with the logic of each
individual module, but only use a generic random (unique) ID since it's a
temporary resource anyway. Also, being a separate DB table, the data will
not be available on an export since it's temporary anyway.
What is left, IMO, is a way to clean this temporary data. On the DB table
side, we can always clean the table data on startup and the URLs will be
gone, however we would need a way to also actually delete the corresponding
temporary files and that may be dependent on the module's logic.
Anyway, WDYT about this direction so far, with using simply
http://<server>/<context>/tmp/<resource id>?
I don’t like it for a few reasons:
* It makes the URL obscure. Right now I’ve deliberately proposed a scheme where you can
still see the resource asked for in the URL. Being obscure makes it really hard to debug
issues and understand what’s happening.
* It’s too complex and fragile (you depend on the DB and you need the DB in sync with your
FS).
* You need a unique id generator based on discrete data, which cannot be guaranteed.
Don’t forget that each module using the temporary resource reference needs to:
* Generate a temporary URL based on discrete elements (in the case I’ve shown: attachment
name, current doc name, generated image name, etc)
* Generate a temporary file on the Filesystem. If you put everything under the same
directory then
** a) you face the OS limitations of the number of files in a single directory
** b) data is not neatly organized in subdirectories which makes it hard to know what is
what when you debug
So far I really prefer the scheme I proposed.
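
As an illustration of the two steps above, here is a minimal Python sketch that derives both the URL and the file location from the same discrete elements, following the officeviewer example in this thread. The function names, the base URL and the temporary directory are made up for the example, and the owner reference is joined with dots without any escaping, which is a simplification.

```python
import base64
from pathlib import PurePosixPath
from urllib.parse import quote_plus


def attachment_segment(attachment_name: str) -> str:
    """Base64 of the attachment name, as in the officeviewer example URL."""
    return base64.b64encode(attachment_name.encode("utf-8")).decode("ascii")


def temporary_url(base, module_id, owner_segments, resource_segments):
    """<base>/tmp/<module id>/<serialized owner reference>/<resource path>."""
    parts = [module_id, ".".join(owner_segments), *resource_segments]
    return base.rstrip("/") + "/tmp/" + "/".join(quote_plus(p) for p in parts)


def temporary_file(tmp_dir, module_id, owner_segments, resource_segments):
    """One subdirectory per module and per owner reference element."""
    encoded = [quote_plus(p) for p in resource_segments]
    return PurePosixPath(tmp_dir, module_id, *owner_segments, *encoded)


owner = ("A", "B", "WebHome")
resource = (attachment_segment("Company Presentation.ppt"),
            "Company Presentation-slide0.jpg")
print(temporary_url("http://server/xwiki", "officeviewer", owner, resource))
print(temporary_file("/tmp/xwiki", "officeviewer", owner, resource))
```

Because the URL and the path come from the same inputs, no DB mapping is needed, and the files end up spread over subdirectories instead of a single flat one. Note that standard base64 output can itself contain "/", "+" and "=", which `quote_plus` percent-encodes here.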
Thanks
-Vincent
Thanks,
Eduard
On Fri, Apr 15, 2016 at 11:30 AM, Marius Dumitru Florea <
mariusdumitru.florea(a)xwiki.com> wrote:
> On Thu, Apr 14, 2016 at 7:46 PM, Vincent Massol <vincent(a)massol.net>
> wrote:
>
>>
>>> On 14 Apr 2016, at 16:52, Marius Dumitru Florea <
>> mariusdumitru.florea(a)xwiki.com> wrote:
>>>
>>> On Thu, Apr 14, 2016 at 5:43 PM, Vincent Massol <vincent(a)massol.net>
>> wrote:
>>>
>>>> Hi devs,
>>>>
>>>> I’m implementing http://jira.xwiki.org/browse/XWIKI-10375 ("Refactor the
>>>> temporary resource concept inside the Resource module") and I need to
>>>> define a URL format for the new “tmp” resource type.
>>>>
>>>> I’m proposing the following:
>>>>
>>>>
>>>> http://<server>/<context>/tmp/<module id>/<serialized owner document reference>/<module-dependent resource path>
>>>>
>>>
>>> Serialized document reference uses backslash to escape special characters
>>> which breaks the URL in Tomcat for security reasons.
>>
>>
>
>> Yes but the same is true whether you have "A\.B.C" or "/A\.B/C".
>>
>
> WDYM? The dot is escaped in the space name with a backslash only when the
> space name is serialized as a reference, which is not the case for the
> standard wiki page URL /xwiki/bin/view/Space.With.Dot/Page.With.Dot
>
> Having a slash or a backslash in the space or page name is less common than
> having a dot ("Version 1.2"). And users might be willing to accept that
> having a backslash in the page (or attachment) name can cause security
> issues with Tomcat, but I doubt they will accept having to avoid dots.
>
>
>> That’s not a blocking issue anyway since we can easily transform them into
>> other characters when we serialize and do the opposite when we parse the
>> URL.
>>
>>>> This is based on the existing TemporaryResourceReference at:
>>>>
>>>> https://github.com/xwiki/xwiki-platform/blob/96caad053c14fc5546e9bc141bc284…
>>>>
>>>> For example:
>>>>
>>>> http://<server>/<context>/tmp/officeviewer/A.B.WebHome/Q29tcGFueSBQcmVzZW50YXRpb24ucHB0/Company+Presentation-slide0.jpg
>>>>
>>>> Note that in this example from the officeviewer macro the module-dependent
>>>> resource path consists of:
>>>>
>>>
>>>
>>>> - base64(name of office attachment + hashcode(parameters))
>>>>
>>>
>>> See http://jira.xwiki.org/browse/XWIKI-11528 for the rationale behind it. I
>>> was trying to avoid backslash (from the serialized attachment reference) in
>>> the URL.
>>
>>
>
>> Yes. However the image name “Company Presentation-slide0” could also
>> contain slashes or backslashes.
>>
>
> It could but it's less common, especially because most Operating Systems
> are not very friendly with these characters when used in file or folder
> names.
>
>
>>
>> Note that I wasn’t sure why you didn’t compute the base64 of both the
>> name of attachment + the parameters instead of having 2 directory levels
>> consisting of the base64 of the attachment name + the hashcode of the
>> parameters as different path segments. Need to check XWIKI-11528, maybe
>> it’s there.
>>
>> IMO we need to treat all path segments in the same way and convert slash
>> and backslash into some other characters. I’m not sure we need the base64
>> solution. But anyway this is an implementation detail of the officeviewer
>> module and not really related to the discussion of the generic Temporary
>> URL format.
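
A minimal Python sketch of what "convert slash and backslash into some other characters" could look like when applied uniformly to every path segment. The "~" escape character and the "~s" / "~b" codes are arbitrary choices made for this example; nothing in this thread fixes the replacement characters.

```python
# Hypothetical mapping: "~" is the escape character, so it must be escaped too.
_ESCAPES = {"~": "~~", "/": "~s", "\\": "~b"}
_UNESCAPES = {"~": "~", "s": "/", "b": "\\"}


def escape_segment(segment: str) -> str:
    """Replace characters that are unsafe in a URL path segment."""
    return "".join(_ESCAPES.get(ch, ch) for ch in segment)


def unescape_segment(escaped: str) -> str:
    """Reverse escape_segment(); reject malformed escape sequences."""
    out = []
    chars = iter(escaped)
    for ch in chars:
        if ch == "~":
            code = next(chars, None)  # character following the escape marker
            if code not in _UNESCAPES:
                raise ValueError(f"invalid escape sequence in {escaped!r}")
            out.append(_UNESCAPES[code])
        else:
            out.append(ch)
    return "".join(out)


print(escape_segment("A\\.B/C"))
```

The same pair of functions would serve for the owner reference, the attachment name and the generated image name, so no segment needs special treatment such as base64.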
>>
>> Thanks
>> -Vincent
>>
>>>> - generated image name from PPT
>>>>
>>>> In this case, the implementation would generate the following file:
>>>>
>>>> [TMPDIR]/officeviewer/A/B/WebHome/Q29tcGFueSBQcmVzZW50YXRpb24ucHB0/Company+Presentation-slide0.jpg
>>>>
>>>> WDYT?
>>>>
>>>> Thanks
>>>> -Vincent